ABSTRACT


In an effort to impede the flow of drugs from South America, a Coalition Force headed
by Joint Interagency Task Force South (JIATF-South) allocates its assets to detect and
interdict drug smuggling vessels such as the self-propelled semi-submersible (SPSS) used
by a Drug Trafficking Organization (DTO). In this thesis, we develop an interdiction
model to place the Coalition Force assets optimally. We also develop a model, known
as the Adapting Evader Model, for a DTO that is able to learn the placement of the
Coalition Force assets. This model is akin to the multi-armed bandit problem. We create
two algorithms for the Adapting Evader Model. One algorithm uses an optimal learning
policy and the other uses a heuristic learning policy. We also create an algorithm for the
interdiction model using the Cross-Entropy method. Finally, we construct a case study
that we use to draw insights about how a DTO that is capable of learning reacts to
different optimal plans. This information can be used by the Coalition Force to allocate
its limited number of assets more effectively during drug interdiction operations.
EXECUTIVE SUMMARY


This thesis focuses on a portion of the drug problem: the shipment of cocaine from South
and Central America into the United States. The major suppliers of these illicit drugs are
the Colombian drug trafficking organizations (DTOs). One of the vessels that they are
now using to traffic cocaine into the United States is the self-propelled semi-submersible
(SPSS). SPSSs are a major issue because they are difficult to detect and
can carry a large amount of cargo. To stop the flow of drugs, the United States has
partnered with other nations to create a Coalition Force, which is headed by Joint
Interagency Task Force South (JIATF-South). Because the Coalition Force has a limited
number of assets, it needs to use them effectively and efficiently.


In this thesis, we develop an interdiction model to allocate the Coalition Force
(interdictor) assets optimally. We also develop a model for a DTO (evader) that is able to
learn the placement of the interdictor assets. This model is referred to as the Adapting
Evader Model and is akin to the multi-armed bandit problem. In both models, there is a
given number of “smuggling” routes available. The interdictor allocates his assets on
these routes. The evader uses these same routes to transport his SPSSs.


In the Adapting Evader Model, the evader, at each decision point, chooses one
route with the goal of maximizing his discounted number of successful “smuggling runs.”
Discounting the number of successes means that it is more beneficial to have a success
sooner than later. By varying the discount factor, we change the patience of the
evader. As the discount factor gets closer to 1, the evader becomes more patient and is more
willing to explore the routes, instead of just using the route that he believes has the
highest probability of success. After trying a route, the evader learns whether the SPSS
successfully traversed the route without being caught. The evader uses the knowledge
about which routes produce successes versus failures to choose the next route. He
continues trying routes for an infinite time horizon. We create two algorithms for the
Adapting Evader Model. One algorithm is exact and uses an optimal learning policy
based on Gittins indices and the other uses a heuristic learning policy.




The interdiction model is an integer nonlinear program that minimizes the
evader’s discounted number of successes by optimally allocating the interdictor’s assets.
We create an algorithm for solving the interdiction model using the Cross-Entropy
method.


Finally, we construct a case study that we use to draw insights about how a
DTO that is capable of learning reacts to different optimal plans. In the case study,
there are five routes and the interdictor has eight assets, which are split into three
different types, each with a different probability of detecting an SPSS. The goal of the
case study is to gain a better understanding of how the Coalition Force can allocate
its limited number of assets more effectively to impede the effectiveness of the
SPSS. One insight is that the interdiction plan is not dependent on the algorithm the
DTO uses to choose his routes. The exact and heuristic algorithms give about the same
interdiction plan. Other attributes, such as the DTO’s patience and prior belief of success
on the routes, play a more substantial role in designing the interdiction plan. Another
insight relates to the time it takes the DTO to find the route with the highest probability
of success, which we refer to as the best route. In the Coalition Force’s worst-case
scenario, in which a patient DTO chooses the routes optimally and has no prior belief
of success, it takes the DTO about forty tries to find the best route. To keep the DTO from
exploiting the knowledge about the best route, the Coalition Force needs to change its
interdiction plan before it believes the DTO has found the best route. By changing the
plan before the DTO can take advantage of this knowledge, the Coalition
Force can increase the number of SPSSs that it interdicts. The models and algorithms
developed in this thesis provide a means for the Coalition Force to allocate its limited
number of assets effectively and efficiently.
I. INTRODUCTION


A. SCENARIO OVERVIEW 


Illegal drugs are a large problem in the United States. The National Drug
Intelligence Center (NDIC), in its National Drug Threat Assessment 2009, states that
more than 35 million people used illicit drugs or abused prescription drugs in 2007
(NDIC, 2009). The Assessment further states, “in September 2008 there were nearly
100,000 inmates in federal prisons convicted and sentenced for drug offenses,
representing more than 52 percent of all federal prisoners.” The major suppliers of the
illicit drugs are the Mexican and Colombian drug trafficking organizations (DTOs).
They “generate, remove, and launder between $18 billion and $39 billion in wholesale
drug proceeds annually” (NDIC, 2009). The federal government allocated more than
$14 billion in 2009 for drug interdiction, counterdrug law enforcement, international
counterdrug assistance, and drug treatment and prevention (NDIC, 2009).


1. Trafficking of Cocaine


A more specific drug problem is that associated with cocaine. “Analysis of law
enforcement reporting as well as national drug threat, availability, demand, and treatment
data indicates that cocaine trafficking is the greatest drug threat to the United States”
(NDIC, 2009). Almost all of the cocaine transported to the United States originates in the
Andean countries of Colombia, Peru, and Bolivia (GAO, 2008). According to the
Interagency Assessment of Cocaine Movement (IACM), between 545 and 707
metric tons of cocaine departed South America for the United States in 2007 (NDIC, 2009).
Cocaine can be transported from South America by air, land, or sea. Recently, the
majority has been transported by sea (GAO, 2008). There are also currently no reports of
cocaine being transported by land (ONDCP, 2008). Figure 1 shows the flow of cocaine
from South America in 2007 as estimated using confirmed, substantiated, and
higher-confidence suspect events in the Consolidated Counterdrug Database (NDIC, 2009).





Source: Interagency Assessment of Cocaine Movement. 


Figure 1. Documented Cocaine Flow Departing South America, January-December 2007
(From NDIC, 2009)


One of the more problematic vessels that the DTOs are using to transport their
drugs is the self-propelled semi-submersible (SPSS). The SPSS is a boat with a low
profile. It combines the capacity of a fishing vessel with the small surface size of a go-fast,
which makes it difficult to find. The majority of the SPSS vessels are between 25 and 65 feet
in length, reach speeds up to 13 knots, hold 4-5 crewmembers, and carry up to 10
metric tons (with an average of 3-6 metric tons) of illicit cargo for distances up to 5000
NM with refueling and 2500 NM without refueling (USCG, 2009).


The Coast Guard estimates that currently 32 percent of all maritime cocaine that
flows through the Eastern Pacific is carried by SPSS. This share is expected to increase: only 23
SPSS events occurred in the 6.5 years before September 2007, but in the 9 months after
September 2007 there were 62 SPSS events (USCG, 2009). Figure 2 is a picture of
a boarded SPSS. Figure 3 is a picture of an SPSS under construction.





Figure 2. An SPSS being boarded (From USCG, 2009) 





Figure 3. An SPSS under construction (From USCG, 2009)


2. Stopping the Trafficking of Cocaine


The Department of Defense leads the federal government’s effort to defend the
United States from drug trafficking. Specifically, Joint Interagency Task Force South
(JIATF-South) is the United States Southern Command (SOUTHCOM) agency that
heads all of the interagency and partner nation counterdrug operations in the Caribbean
Sea, Gulf of Mexico, and Eastern Pacific. For counterdrug operations in this Area of
Responsibility (AOR), a variety of American and foreign assets are used for
detection, monitoring, and interdiction.


The United States Navy and Coast Guard, as well as partner nations such as
Britain, France, the Netherlands, Canada, and Colombia, use ships to help patrol the AOR
(JIATF, 2009). The ships include United States Navy frigates (see Figure 4), United
States Coast Guard cutters, and partner nation frigates. For the interdiction of
suspected vessels, there is a Coast Guard Law Enforcement Detachment embarked on the
US ships, and sometimes on the partner nation ships (JIATF, 2009). Aboard some of the
United States Navy ships there are helicopter squadron detachments to help with
detection and monitoring (see Figure 5) (JIATF, 2009). To complement the Coalition
Force’s ship operations, the United States and partner nations also use land-based aircraft
such as maritime patrol aircraft (MPA) and airborne early warning (AEW) aircraft (see
Figure 6).





Figure 4. United States Navy Frigate (USS John L. Hall, FFG-32) (From Jane’s,
Oliver Hazard Perry class, 2010)





Figure 5. United States Navy Helicopter (MH-60R) (From Jane’s, Sikorsky MH-60R 
Seahawk, 2010) 





Figure 6. United States Navy P-3C Orion Airborne Early Warning (AEW) Aircraft
(From Jane’s, Lockheed P-3C Orion, 2010)


Currently, the United States is making some significant strides in the battle
against cocaine trafficking. In the past year, according to the NDIC (2009), there has
been a decrease in the domestic availability of cocaine. The reason is unclear, but the NDIC
states that one of the factors is most likely “several exceptionally large cocaine seizures
made while the drug was in transit toward the United States” (NDIC, 2009). Since, as
Coast Guard Rear Adm. Joseph L. Nimmich, Commander, JIATF-South, stated, "Every
time we turn around, the smugglers are extraordinarily creative, extraordinarily adaptive,"
the United States needs to continue to improve the way it interdicts the drug
smugglers’ vessels (SOUTHCOM, 2009). One way to improve is with smarter
methods of employing the Coalition Force’s search and interdiction platforms.


B. SCOPE, GOAL, AND BENEFITS OF STUDY 


This thesis focuses on a portion of the drug problem: the shipment of cocaine
from South America into the United States using the SPSS. The SPSS is an effective
smuggling vessel; its low profile makes it difficult to detect and it is able to carry a large
cargo. The SPSS also has the capability of smuggling other contraband such as weapons
of mass destruction. For this reason, the goal of this thesis is to find a better way to
allocate the Coalition Force assets to increase the number of interdictions of SPSSs.


This thesis develops a model that produces the optimal allocation of Coalition
Force assets against a DTO that is capable of learning the location of those assets. This
model will have operational value when used as a decision aid. It also provides a means
to delve into the decision process of a DTO choosing where to use his SPSSs.


C. THESIS ORGANIZATION 


This thesis is organized as follows. Chapter I presents an overview of the problem and
the main players. Chapter II describes the scenario and reviews other references that have
delved into this topic. The models and algorithms used in this thesis are presented in
Chapter III. A numerical case study is presented and analyzed in Chapter IV. Chapter V
summarizes this thesis with conclusions. Appendix A contains tables of pre-computed
Gittins Indices and values for estimating the Gittins Index. Appendix B describes
how to calculate the number of possible interdiction plans. Appendix C describes how
the probability of success on each route is computed for the numerical case study in
Chapter IV. Appendix D is a graphical representation of the results from the case study.


II. PROBLEM DESCRIPTION


A. SITUATION 


An evader (i.e., DTO) and an interdictor (i.e., Coalition Force) are operating
against each other in an environment consisting of $n$ routes. A route consists of a
starting location, a path, and a destination. For instance, Figure 1 shows an environment
with seven routes ($n = 7$). Both the evader and interdictor know the routes and they
cannot change them.


The evader travels these routes with his vessels. The vessels go one at a time
across a route in hopes of a successful trip. A successful trip means that the vessel makes
it to its destination. Associated with route $i$, $i = 1, 2, \ldots, n$, is the probability $p_i$ that a
trip on this route is completed successfully. The evader does not know $p_i$ and thus must try
to learn these probabilities. The only way for the evader to learn is to try the route and
experience the outcome. An outcome will be either a success or a failure. Each try of a
route takes a time period (e.g., a week). At the beginning of time period $t$ a vessel
becomes available. The evader will then choose a route, say route $i$, for the vessel to
traverse. By the end of the time period, the evader knows if the vessel succeeds or fails.
With this knowledge about route $i$, the evader updates his belief, $\theta_i(t)$, on route $i$ for
time period $t$. $\theta_i(t)$ is a random variable that represents the evader’s belief about the
probability of success of traversing route $i$ at time period $t$. The evader does not update
his belief (the pdf of $\theta_i(t)$) on any of the other routes for time period $t$. With the updated
belief, the evader now uses his beliefs on the routes ($\theta_i(t+1)$, $i = 1, 2, \ldots, n$) to choose a
route to use for the next vessel during the following time period ($t+1$). This process
continues for an infinite time horizon. The evader has the option to explore different
routes and risk a vessel now to gain knowledge for later use. The evader aims to find a
best route so he can maximize his expected number of discounted successes. A best route
is a route with a probability of success that is equal to or greater than the respective
probabilities of the other routes. It is possible to have multiple best routes. By finding
a best route as soon as possible, the evader maximizes his expected number of
discounted successes. The willingness of the evader to search for the best route is
dependent on the evader’s patience. This patience can be represented using a discount
factor (Fudenberg & Levine, 1998). The discount factor decreases the value of a success
to the evader as $t$ increases. At each time period, the value of a success to the evader is
exponentially discounted. We denote the discount factor by $a$, which is between zero
and one. As $a$ gets closer to one, the evader gets more patient. A more patient evader
will explore the routes longer than an impatient evader.
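
To make the role of the discount factor concrete, the following minimal Python sketch computes a discounted number of successes for a given sequence of trip outcomes. It assumes that a success in time period $t$ is weighted by $a^t$ with $t$ starting at 1; the function name `discounted_successes` and the two outcome sequences are hypothetical and serve only as an illustration.

```python
def discounted_successes(outcomes, a):
    """Sum a**t over the time periods t = 1, 2, ... whose trip succeeded.

    outcomes: sequence of booleans; outcomes[t - 1] is True if the trip in
              time period t was a success (hypothetical data).
    a:        discount factor, strictly between 0 and 1.
    """
    return sum(a ** t for t, success in enumerate(outcomes, start=1) if success)


# Hypothetical outcome sequences: both contain three successes in six trips.
early = [True, True, True, False, False, False]
late = [False, False, False, True, True, True]

for a in (0.50, 0.99):
    print(a, discounted_successes(early, a), discounted_successes(late, a))
```

With $a = 0.50$ the early successes are worth several times more than the late ones, while with $a = 0.99$ the two sequences are worth nearly the same. This is why a patient evader can afford to spend early trips exploring.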


The interdictor determines the probabilities of success on the routes that the
evader is trying to learn. He does this by allocating his limited number of assets on the $n$
routes. The assets assigned to route $i$ yield a specific $p_i$. The interdictor has $m$
different types of assets ($j = 1, 2, \ldots, m$), which have different probabilities of detecting
a vessel. The corresponding probability of detecting a vessel for asset $j$ on route $i$ is
$q_{i,j}$. An asset’s probability of detecting a vessel is independent of the other assets’
probabilities of detecting a vessel. Let $X_{i,j}$ be the number of assets of type $j$ on route $i$
and let $X$ be the vector with components $X_{i,j}$ for all $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$.
We refer to $X$ as an interdiction plan. We denote the number of assets of type $j$
available by $r_j$. We let $p_i(X)$ be the probability of success along route $i$ given an
interdiction plan $X$, which we compute using $q_{i,j}$; see Appendix A. The interdictor aims
to minimize the evader’s expected number of discounted successes over the infinite time
horizon.
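
As a rough illustration of how an interdiction plan could translate into a route's probability of success, the Python sketch below multiplies the non-detection probabilities of all assets on a route. This product form is an assumption that follows from the stated independence of the assets together with the further assumption that a trip succeeds only if no asset detects the vessel; the computation actually used in the thesis is the one given in its appendix. The function name and the detection probabilities are hypothetical.

```python
def route_success_probability(q_i, x_i):
    """Probability that a vessel crosses one route undetected.

    q_i: detection probabilities q[i][j] on this route, one per asset type j
         (hypothetical values).
    x_i: number of assets X[i][j] of each type placed on this route.
    Assumption: a trip succeeds only if every asset independently fails
    to detect the vessel.
    """
    prob = 1.0
    for q, x in zip(q_i, x_i):
        prob *= (1.0 - q) ** x
    return prob


# Hypothetical route with two asset types.
q_route = [0.5, 0.2]
print(route_success_probability(q_route, [1, 2]))  # one asset of type 1, two of type 2
print(route_success_probability(q_route, [0, 0]))  # an unguarded route
```

Under this assumption, an unguarded route always yields a success and every additional asset lowers the route's probability of success.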


B. LITERATURE REVIEW 


Using mathematical models to allocate resources that interdict the flow of
contraband is not new. Wood (1993) uses network interdiction models¹ to help decide
where to allocate a limited number of resources to interdict drugs in South America. He
develops flexible integer programming models to minimize the maximum flow that can
be pushed through a capacitated network.


1 An interdiction model is also known as an attacker-defender model. This model is a Stackelberg
game (von Stackelberg, 1952). In a Stackelberg game two sides play sequentially. The interdictor goes
first, followed by the evader. For a more detailed explanation and extension of these kinds of models, see
Brown et al. (2006).


More recently, Pfeiff (2009) uses an interdiction model to help the Coalition
Force allocate their assets to catch the most SPSSs, in both the Eastern Pacific and the
Caribbean. The solution to his model is a mixed strategy for the interdictor and a least
risk path for the evader.


Washburn and Wood (1995) develop game-theoretic models for the drug flow
along a network of roads and rivers in parts of South America. They use a network
interdiction model to solve this two-person zero-sum game, with the solution being the
maximum expected equilibrium.


Dimitrov et al. (2009) use dynamic programming in their model to show the
optimal places to build stationary radiation detectors on a transportation network. The
nuclear-material smuggler in the model is adaptive, but also has full knowledge of the
detector locations and evasion probabilities.


Bailey et al. (1994) model the interaction of the United States Coast Guard cutters
and smuggling vessels using Monte Carlo simulation. The evader chooses its route using
a sequence of finite horizon dynamic programs. The dynamic programs take into account
an evader that is “forced to combine his short-run profit goals with his need to gain future
information about the configuration of the cutters” (Bailey et al., 1994). This
configuration of the interdictors becomes a discrete-time Markov chain from the evader’s
point of view.


Caulkins et al. (1993) model how interdicting cocaine shipments affects smuggling
costs using dynamic programming and Monte Carlo simulation. The evader has three
modes of transportation: air, sea, and land. Each mode can use a set number of routes.
The routes are treated as generic since the routes in the model are not associated with any
particular geographic boundaries (Caulkins et al., 1993). The evader uses a heuristic
algorithm to choose a route in the resulting model, which is a multi-armed bandit
problem.² The heuristic algorithm randomly chooses a route based on time-weighted
estimates of the probability of interdiction. The interdictor in this model is only
represented by a probability of interdiction on each route and not by any physical asset.


The first four papers differ from this thesis in the modeling of the evader. Wood
(1993) and Pfeiff (2009) both have an evader that neither knows nor is able to learn the
interdictor’s plan. Washburn & Wood (1995) and Dimitrov et al. (2009) both have an
evader that already knows the interdictor’s plan in some sense. In Washburn & Wood
(1995), the interdictor’s plan is a mixed strategy. In Dimitrov et al. (2009), the interdictor’s
plan is the location of the detectors. The last two papers differ from this thesis in the
modeling of the interdictor. Bailey et al. (1994) and Caulkins et al. (1993) both have an
evader that is adaptive and capable of learning the interdictor’s plan. However, the
interdictor’s plan, in both these papers, is determined by the user and is not optimized.
This thesis develops models that find the optimal interdiction plan against an evader that
is adaptive and capable of learning this plan.





2 The first person to pose the bandit problem was W. Thompson in 1933 (Berry & Fristedt, 1985). 
Thompson (1933, 1935) uses the Beta distribution with Bayesian updating to maximize the expected 
number of successes in the first T pulls for two arms. There was little interest in the bandit problem until 
1952 when H. Robbins wrote a paper on it. Since Robbins, many people have explored the multi-armed 
bandit problem and its extensions, which are laid out in Berry and Fristedt’s book (1985). 
III. MODEL AND ALGORITHM DEVELOPMENT


A. OPTIMIZATION MODELS FOR THE EVADER AND INTERDICTOR 
1. The Evader Optimization Model 


We develop a model for an evader that is able to learn the placement of the
interdictor’s assets, which we refer to as the Adapting Evader Model. In our model, the
evader’s belief about the probability of success on each route is represented by a beta
distribution, as in Thompson (1933, 1935); see footnote 2. The
beta distribution is able to account for the two parts associated with a belief. The first
part is the believed probability of success and the second part is the uncertainty
associated with that probability. A beta distribution can represent these parts using its
parameters $\alpha$ and $\beta$. The believed probability is the mean of the beta distribution and is
calculated by

$$\frac{\alpha}{\alpha + \beta}. \tag{3.1}$$

The confidence of the belief is represented by the variance of the beta distribution and is
calculated by

$$\frac{\alpha\beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}. \tag{3.2}$$

$\theta_i(t)$ has a beta distribution with parameters $\alpha_i$ and $\beta_i$ for time period $t$. Using
Bayesian updating, if route $i$ is used and the trip is a success, then $\alpha_i$ is increased by one,
but if the trip is a failure then $\beta_i$ is increased by one. The Bayesian updating is possible
since the beta distribution is a conjugate pair with the Bernoulli distribution. To start the
Bayesian update, the evader needs to have an initial belief on route $i$. We denote the
initial belief on route $i$ by $\alpha_i^0$ and $\beta_i^0$, which are inputs to the model. The initial belief
gives modeling flexibility. It allows the modeling of different prior dispositions about the
routes. The Adapting Evader Model follows below and takes the form of a multi-armed
bandit problem.
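
The belief bookkeeping described above can be sketched in a few lines of Python. The class below stores the two beta parameters for one route, evaluates Equations 3.1 and 3.2, and applies the Bayesian update; the class name `RouteBelief` and the sample outcomes are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class RouteBelief:
    """Beta(alpha, beta) belief about one route's probability of success."""
    alpha: float
    beta: float

    def mean(self) -> float:
        # Equation 3.1: believed probability of success.
        return self.alpha / (self.alpha + self.beta)

    def variance(self) -> float:
        # Equation 3.2: uncertainty around the believed probability.
        total = self.alpha + self.beta
        return self.alpha * self.beta / (total ** 2 * (total + 1))

    def update(self, success: bool) -> None:
        # Bayesian update for a Bernoulli outcome with a beta prior.
        if success:
            self.alpha += 1
        else:
            self.beta += 1


# Hypothetical initial belief and trip outcomes on a single route.
belief = RouteBelief(alpha=1, beta=1)
for outcome in (True, True, False):
    belief.update(outcome)
    print(belief.alpha, belief.beta, belief.mean(), belief.variance())
```

Each observed trip moves the mean toward the observed success rate and, as observations accumulate, shrinks the variance, which is how the evader's confidence in a route grows.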
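Adapting Evader Model


Index
    $i$            routes ($i = 1, 2, \ldots, n$)
Data
    $a$            discount factor
    $\alpha_i^0$   initial value of alpha on route $i$; $\boldsymbol{\alpha}^0 = (\alpha_1^0, \alpha_2^0, \ldots, \alpha_n^0)$
    $\beta_i^0$    initial value of beta on route $i$; $\boldsymbol{\beta}^0 = (\beta_1^0, \beta_2^0, \ldots, \beta_n^0)$
    $p_i$          actual probability of success on route $i$
State
    $\alpha_i$     current value of alpha on route $i$
    $\beta_i$      current value of beta on route $i$
Function

$$
\begin{aligned}
R(\boldsymbol{\alpha}, \boldsymbol{\beta}) = \max \Big\{\;
 & p_1 \big[ a + a R\big((\alpha_1 + 1, \alpha_2, \ldots, \alpha_n), \boldsymbol{\beta}\big) \big]
   + (1 - p_1) \big[ 0 + a R\big(\boldsymbol{\alpha}, (\beta_1 + 1, \beta_2, \ldots, \beta_n)\big) \big], \\
 & p_2 \big[ a + a R\big((\alpha_1, \alpha_2 + 1, \ldots, \alpha_n), \boldsymbol{\beta}\big) \big]
   + (1 - p_2) \big[ 0 + a R\big(\boldsymbol{\alpha}, (\beta_1, \beta_2 + 1, \ldots, \beta_n)\big) \big], \\
 & \qquad \vdots \\
 & p_n \big[ a + a R\big((\alpha_1, \alpha_2, \ldots, \alpha_n + 1), \boldsymbol{\beta}\big) \big]
   + (1 - p_n) \big[ 0 + a R\big(\boldsymbol{\alpha}, (\beta_1, \beta_2, \ldots, \beta_n + 1)\big) \big]
 \;\Big\}
\end{aligned}
\tag{3.3}
$$

Formulation

    Determine $R(\boldsymbol{\alpha}^0, \boldsymbol{\beta}^0)$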


The Adapting Evader Model is a dynamic program with Bellman’s equation given by
Equation 3.3. The current reward in Equation 3.3 is the expected discounted success for
that time period, which is $p_i(a)$ for a success and $(1 - p_i)(0)$ for a failure. The future
rewards are the discounted successes for later time periods, which are
$p_i \big[ a R\big((\alpha_1, \alpha_2, \ldots, \alpha_i + 1, \ldots, \alpha_n), \boldsymbol{\beta}\big) \big]$ for a success and
$(1 - p_i) \big[ a R\big(\boldsymbol{\alpha}, (\beta_1, \beta_2, \ldots, \beta_i + 1, \ldots, \beta_n)\big) \big]$ for a failure. Solving the Adapting Evader
Model using the backward-recursion dynamic programming algorithm is difficult, so we
instead use a procedure for determining an optimal policy based on Gittins indices³
and a heuristic policy, as we describe in Section B.


2. The Interdictor Optimization Model


We develop an interdiction model that finds the optimal allocation of the
interdictor’s assets, which we refer to as the Asset Allocation Model. The optimal
objective value of our model is the minimum, over interdiction plans, of the evader’s maximum
expected number of discounted successes as defined by the Adapting Evader Model. The Asset
Allocation Model consists of minimizing this maximum by selecting an interdiction plan as
described next.





3 Gittins & Jones (1974) solve the multi-armed bandit problem optimally using Gittins Indices. Gittins
Indices are optimal under the following assumptions: exponential discounting, an infinite time horizon, and
beliefs about routes that are not chosen cannot change. Since then, others such as Whittle (1980), Varaiya
et al. (1985), Tsitsiklis (1986), Gittins (1989), Weber (1992), Tsitsiklis (1994), Bertsimas & Niño-Mora
(1996), and Dacre et al. (1999) have also proven the optimality of the Gittins index.
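Asset Allocation Model


Index
    $i$            routes ($i = 1, 2, \ldots, n$)
    $j$            asset types ($j = 1, 2, \ldots, m$)
Data
    $a$            discount factor
    $\alpha_i^0$   initial value of alpha on route $i$; $\boldsymbol{\alpha}^0 = (\alpha_1^0, \alpha_2^0, \ldots, \alpha_n^0)$
    $\beta_i^0$    initial value of beta on route $i$; $\boldsymbol{\beta}^0 = (\beta_1^0, \beta_2^0, \ldots, \beta_n^0)$
    $r_j$          total number of assets of type $j$
Decision Variables
    $X_{i,j}$      number of assets of type $j$ on route $i$
Functions
    $p_i(X)$       actual probability of success on route $i$ given interdiction plan $X$;
                   $X = (X_{1,1}, X_{1,2}, \ldots, X_{n,m})$
    $f(X; \boldsymbol{\alpha}^0, \boldsymbol{\beta}^0) = R(\boldsymbol{\alpha}^0, \boldsymbol{\beta}^0)$
                   expected number of discounted successes given interdiction plan $X$
                   and $\boldsymbol{\alpha}^0$ and $\boldsymbol{\beta}^0$, as defined by $R(\boldsymbol{\alpha}^0, \boldsymbol{\beta}^0)$ in the Adapting Evader Model
                   with $p_i(X)$ replacing $p_i$ for all $i$

Formulation

$$
\begin{aligned}
\min_{X} \quad & f(X; \boldsymbol{\alpha}^0, \boldsymbol{\beta}^0) \\
\text{s.t.} \quad & \sum_{i=1}^{n} X_{i,j} = r_j, \quad \forall j \\
 & X_{i,j} \in \{0, 1, \ldots, r_j\}, \quad \forall i, j
\end{aligned}
$$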


The Asset Allocation Model minimizes the objective function of the Adapting Evader
Model by choosing the optimal interdiction plan.
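
To give a sense of the size of the feasible set, the Python sketch below counts and enumerates interdiction plans. It assumes that every asset of every type must be placed on exactly one route, so that each asset type contributes a "stars and bars" count; whether this matches the counting argument in the thesis's appendix is not verified here. The function names and the split of eight assets into types of sizes 3, 3, and 2 are hypothetical.

```python
from itertools import product
from math import comb


def count_plans(n, r):
    """Number of plans when all r[j] assets of each type j are placed on n routes."""
    total = 1
    for r_j in r:
        total *= comb(r_j + n - 1, n - 1)  # ways to split r_j identical assets over n routes
    return total


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers that sum to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def enumerate_plans(n, r):
    """Yield plans as tuples; plan[j][i] is the number of type-j assets on route i."""
    per_type = [list(compositions(r_j, n)) for r_j in r]
    return product(*per_type)


# Hypothetical instance: five routes, eight assets split into three types.
print(count_plans(5, (3, 3, 2)))
# Small instance where full enumeration is cheap.
print(sum(1 for _ in enumerate_plans(3, (2, 1))), count_plans(3, (2, 1)))
```

Even for this small hypothetical instance the count runs into the thousands, and each plan requires its own evaluation of the evader's response, which helps motivate a sampling-based search over plans rather than exhaustive enumeration.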
B. OPTIMIZATION ALGORITHMS
1. The Evader Optimization Algorithm


This thesis develops two algorithms for the Adapting Evader Model. The first
uses an optimal policy, which we denote Gittins Choice Algorithm. The second uses a
heuristic policy, which we denote Decreasing Choice Algorithm. In both of these
algorithms, we let $T$ be the maximum number of time periods with $t = 1, 2, \ldots, T$. We
set $T$ to be a large number to represent an infinite time horizon. In the Decreasing
Choice Algorithm, we let $\varepsilon_0$ be a user-defined parameter and $\varepsilon_t$ be a variable that
changes with time.


a. Optimal Evader Optimization Algorithm 


We use Gittins Index to solve the evader optimization model optimally.
By using Gittins Index, the evader’s $n$-dimensional problem, which is choosing a route at
each time period to achieve the highest expected number of discounted successes in an
infinite horizon, becomes $n$ one-dimensional problems (Varaiya et al., 1985). Gittins
(1989) provides an equation to estimate the Index and tables of pre-computed Indices.
The equation and tables depend directly on $a$, since $a$ dictates how much the evader
wants to explore versus exploit. The equation and tables for $a = 0.50$ and $a = 0.99$ are
given in Appendix B.


To use Gittins Indices, a Gittins Index is calculated for each of the n 
routes. The evader chooses to use the route with the highest index. After learning if the 
vessel completes its trip successfully, the evader then recalculates the Index for the 
chosen route. The process continues for an infinite time horizon. 

As an illustration of Gittins Index, consider an evader that has two routes
to choose from ($n = 2$) and is patient ($a = 0.99$). The evader’s prior belief on Route 1 is
manifested in $\alpha = 1$ and $\beta = 1$. The evader’s prior belief on Route 2 is manifested in
$\alpha = 30$ and $\beta = 10$. If the evader chooses a route based on expected probability of
success alone (Equation 3.1), then he would choose Route 2 since 0.75 is greater than 0.5.
Using Gittins Index, in this example, the evader chooses Route 1 since its Gittins index is
larger. Since there is more uncertainty around Route 1 compared to Route 2, Route 1 has
a higher potential to have a high probability of success. In this instance, using Gittins
Index, the evader will explore instead of exploit, since he chooses the route with more
uncertainty over the route with a higher expected probability of success. Using the same
example with an impatient evader ($a = 0.50$) changes the outcome. The evader chooses
to exploit instead of exploring and selects Route 2.


Below is the Gittins Choice Algorithm, which solves the Adapting Evader
Model. The policy determined by the Gittins Choice Algorithm is optimal, but the
optimal value (expected number of discounted successes) is estimated using Monte Carlo
simulation.


Gittins Choice Algorithm
1. Set initial conditions
   $T$: Maximum time; $t = 1$; $a$: discount factor; $n$: routes ($i = 1, 2, \ldots, n$)
   $\alpha_i^0$, $\beta_i^0$: Evader’s initial belief about the probability of success on route $i$.
   $p_i$: Actual probability of a success on route $i$.
2. Calculate Gittins Index for all of the routes using the initial beliefs.
3. Choose the route with the highest index (route $i$).
4. Randomly decide if the run was a success using the actual probability on route $i$ ($p_i$).
   If run was a success, increase the optimal value by the value $a^t$.
5. Update the belief on route $i$.
   If run was a success, increase $\alpha_i$ by 1. Else, increase $\beta_i$ by 1 for route $i$.
6. Recalculate Gittins Index for route $i$.
7. Increase $t$ by 1.
8. If $t > T$, then stop. Else, go to step 3.
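
The steps of the Gittins Choice Algorithm can be sketched in Python as follows. The thesis relies on the tables and estimating equation of Gittins (1989) for the index; this sketch instead approximates the index numerically with a standard calibration argument (bisection on a sure per-period payoff against a truncated dynamic program), which is a substitute technique and not the thesis's procedure. The function names, the truncation `horizon`, the tolerance `tol`, and all numerical inputs are assumptions made for illustration.

```python
import random
from functools import lru_cache


@lru_cache(maxsize=None)
def gittins_index(alpha, beta, a, horizon=150, tol=1e-3):
    """Approximate Gittins index of a Beta(alpha, beta) route (calibration sketch)."""
    lo, hi = alpha / (alpha + beta), 1.0       # the index is never below the mean
    while hi - lo > tol:
        lam = (lo + hi) / 2.0
        retire = lam / (1.0 - a)               # value of a sure payoff lam in every period
        n_end = alpha + beta + horizon
        # Terminal values after `horizon` more trips; s counts the extra successes.
        values = [max(retire, (alpha + s) / n_end / (1.0 - a))
                  for s in range(horizon + 1)]
        for d in range(horizon - 1, 0, -1):    # backward induction over trip depth d
            n_d = alpha + beta + d
            values = [max(retire,
                          (alpha + s) / n_d * (1.0 + a * values[s + 1])
                          + (1.0 - (alpha + s) / n_d) * a * values[s])
                      for s in range(d + 1)]
        mu = alpha / (alpha + beta)
        risky = mu * (1.0 + a * values[1]) + (1.0 - mu) * a * values[0]
        if risky > retire:                     # trying the route still beats the sure payoff
            lo = lam
        else:
            hi = lam
    return (lo + hi) / 2.0


def gittins_choice(p, alpha0, beta0, a, T, rng, horizon=150):
    """One Monte Carlo replication of steps 1-8; returns the discounted successes."""
    alpha, beta = list(alpha0), list(beta0)
    index = [gittins_index(alpha[i], beta[i], a, horizon) for i in range(len(p))]  # step 2
    value = 0.0
    for t in range(1, T + 1):
        i = max(range(len(p)), key=index.__getitem__)             # step 3
        if rng.random() < p[i]:                                   # step 4
            value += a ** t
            alpha[i] += 1                                         # step 5
        else:
            beta[i] += 1
        index[i] = gittins_index(alpha[i], beta[i], a, horizon)   # step 6
    return value


# Two-route illustration from the text: Route 1 is Beta(1, 1), Route 2 is Beta(30, 10).
for a in (0.99, 0.50):
    print(a, gittins_index(1, 1, a), gittins_index(30, 10, a))

# Hypothetical replication: three routes with made-up success probabilities.
print(gittins_choice([0.3, 0.6, 0.8], [1, 1, 1], [1, 1, 1], 0.9, 40,
                     random.Random(1), horizon=40))
```

Averaging `gittins_choice` over many replications estimates the expected number of discounted successes. The truncated horizon trades accuracy for speed, so the printed indices should be read as approximations rather than table values.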


b. Heuristic Evader Optimization Algorithm 


The evader could choose from many heuristic policies. For instance, the
evader could just explore at every time period by randomly choosing a route. At the
other extreme, the evader could exploit at every time period; he would choose the route
with the highest expected probability of success. We consider a heuristic policy that is
something in between, the $\varepsilon$-decreasing strategy; see Vermorel & Mohri (2005).


In each time period, using the $\varepsilon$-decreasing strategy, the evader randomly
decides whether to explore (with probability $\varepsilon_t$) or exploit (with probability $1 - \varepsilon_t$)
where:

$$\varepsilon_t = \min\left\{1, \frac{\varepsilon_0}{t}\right\} \tag{3.4}$$

and $\varepsilon_0$ is a non-negative value set by the user. $\varepsilon_0$ represents the evader’s patience. A higher
value for $\varepsilon_0$ means the evader will explore for a longer time before he starts to exploit. If
$\varepsilon_0 = 0$ then the evader will always exploit and choose the route with the highest believed
probability of success. The believed probability of success on each route is the mean of
the beta distribution and is calculated by Equation 3.1. $\varepsilon_t$ is monotonically non-increasing as
time ($t$) increases. This means that as time increases the evader is more likely to exploit
than to explore. There will also be a point (depending on the value of $\varepsilon_0$) when the
evader will exploit with a very high probability.


Below is the Decreasing Choice Algorithm, which uses a heuristic policy
for the Adapting Evader Model. This algorithm uses Monte Carlo simulation to evaluate
the value of the policy.


Decreasing Choice Algorithm
1. Set initial conditions
$T$: Maximum time; $t = 1$; $a$: discount factor
$\epsilon_0$: Non-negative value to represent the evader's patience.
$n$: Routes ($i = 1, 2, \ldots, n$)
$\alpha_i$, $\beta_i$: Evader's initial belief about the probability of success on route $i$.
$p_i$: Actual probability of a successful run on route $i$.
2. Calculate the believed probability of success on the routes using Equation 3.1.
3. Calculate $\epsilon_t$ using Equation 3.4.
4. Randomly choose a number ($\delta$), where $\delta \in [0,1]$.
If $\delta > \epsilon_t$ then choose the route with the highest believed probability of success.
Else, randomly choose a route.
5. Randomly decide if the run was a success using the actual probability on route $i$ ($p_i$).
If the run was a success, then increase the optimal value by $a^t$.
6. Update the belief on route $i$.
If the run was a success, then increase $\alpha_i$ by 1. Else, increase $\beta_i$ by 1.
7. Recalculate the believed probability of success on route $i$ using Equation 3.1.
8. Increase $t$ by 1.
9. If $t > T$, then stop. Else, go to step 3.
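The steps of the Decreasing Choice Algorithm can be sketched in Python as follows. This is a minimal illustration, not the thesis's VBA implementation: the function names `decreasing_choice` and `estimate_value` are hypothetical, Equation 3.1 is taken to be the beta mean $\alpha_i / (\alpha_i + \beta_i)$ as described in the text, and ties between routes are broken by the lowest route index (an assumption).

```python
import random


def decreasing_choice(p, alpha0, beta0, eps0, a, T, rng):
    """One Monte Carlo replication of the epsilon-decreasing evader.

    p             -- actual success probability on each route
    alpha0, beta0 -- initial beta parameters of the evader's belief
    eps0          -- the evader's patience (epsilon_0 in Equation 3.4)
    a, T          -- discount factor and maximum time
    Returns the discounted number of successes.
    """
    alpha, beta = list(alpha0), list(beta0)
    n = len(p)
    value = 0.0
    for t in range(1, T + 1):
        # Believed probability of success: mean of each beta distribution.
        belief = [alpha[i] / (alpha[i] + beta[i]) for i in range(n)]
        eps_t = min(1.0, eps0 / t)  # Equation 3.4
        if rng.random() > eps_t:
            # Exploit: best believed route (ties go to the lowest index).
            i = max(range(n), key=lambda r: belief[r])
        else:
            # Explore: pick a route uniformly at random.
            i = rng.randrange(n)
        if rng.random() < p[i]:
            value += a ** t  # a success in period t is worth a^t
            alpha[i] += 1
        else:
            beta[i] += 1
    return value


def estimate_value(p, alpha0, beta0, eps0, a, T, replications=1000, seed=300):
    """Average the discounted successes over independent replications."""
    rng = random.Random(seed)
    total = sum(decreasing_choice(p, alpha0, beta0, eps0, a, T, rng)
                for _ in range(replications))
    return total / replications
```

The defaults of 1000 replications and seed 300 mirror the values quoted in the case study, but Python's random number generator differs from VBA's, so the numbers from this sketch would not match the thesis results.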




2. The Interdictor Optimization Algorithm 


We solve the Asset Allocation Model by the Cross-Entropy (CE) method (Rubinstein & Kroese, 2004; Allon et al., 2004). The CE method is an iterative method that generates a sequence of sets of interdiction plans according to a specified random mechanism. We denote the random mechanism by $\psi^l$ for iteration $l$. This random mechanism generates a user-specified number of interdiction plans $G^l$ in iteration $l$. As in Allon et al. (2004), we do not change the number of interdiction plans between iterations and drop the superscript henceforth. We let $g = 1, 2, \ldots, G$ be the interdiction plan counter. We then let $X^{l,g}$ be the $g$th interdiction plan generated by $\psi^l$ in iteration $l$.


The random mechanism induces a probability mass function (pmf) for each asset type and each route, except Route $n$. Route $n$ is excluded since it receives the assets that have not already been allocated to the previous routes. We denote the probability for route $i$ to have $k$ assets of type $j$ in iteration $l$ by $\psi^l_{i,j,k}$. The random mechanism consists of the collection of these pmfs, i.e., $\psi^l = \{\psi^l_{i,j,k} : 1 \le i < n,\ 1 \le j \le m,\ 0 \le k \le r_j\}$. The initial pmf for each route is uniform, which means that every number of assets on route $i$ (within the limit of the total number of assets available) is equally likely. The pmf for Route 1 and asset type $j$ in iteration $l$ is $\psi^l_{1,j,0}, \psi^l_{1,j,1}, \ldots, \psi^l_{1,j,r_j}$, where $r_j$ is the number of assets of type $j$. Table 1 is an example of an initial pmf for four routes and five assets for asset type $j$. The hyphens in Table 1 on Route 4 indicate that there is no pmf on Route 4, since the assets that have not been allocated on Routes 1, 2, or 3 will be allocated on Route 4.
Number of Assets on the Route of Type j
Route | 0 | 1 | 2 | 3 | 4 | 5
1 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6
2 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6
3 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6
4 | - | - | - | - | - | -

Table 1. An example of an initial pmf for four routes ($n = 4$) and five assets ($r_j = 5$)


The possible number of assets that can be allocated on a route depends on the placement of assets on previous routes. To generate the plan for assets of type $j$ in iteration $l$, we randomly choose the number of assets for Route 1 according to its pmf. Given that the number of assets of type $j$ on Route 1 is $y$, we construct a renormalized distribution for Route 2 as shown in Equation 3.5

$$\frac{\psi^l_{2,j,k}}{\psi^l_{2,j,0} + \psi^l_{2,j,1} + \cdots + \psi^l_{2,j,r_j - y}} \quad \text{for all } j;\ k = 0, 1, \ldots, r_j - y \tag{3.5}$$

and generate the number of assets on Route 2. Given that the total number of assets of type $j$ on Route 1 and Route 2 is $y$, we construct a renormalized distribution for Route 3 as shown in Equation 3.6

$$\frac{\psi^l_{3,j,k}}{\psi^l_{3,j,0} + \psi^l_{3,j,1} + \cdots + \psi^l_{3,j,r_j - y}} \quad \text{for all } j;\ k = 0, 1, \ldots, r_j - y \tag{3.6}$$

and generate the number of assets on Route 3. We continue this procedure for all of the other routes and the other asset types to compute $X^{l,g}_{i,j}$, which is the number of assets of type $j$ on route $i$ in interdiction plan $X^{l,g}$. Table 2 is an example of a renormalized distribution for Route 3, from the previous example of Table 1. The black squares in Table 2 denote asset allocations that are no longer possible since some assets are allocated on previous routes. Notice that the probabilities on Route 3 still add up to one.
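The sequential sampling with renormalization described above can be sketched for a single asset type as follows. This is an illustrative sketch only: the function name `sample_allocation` and the list-of-lists layout of the pmf are assumptions, and the fallback to zero assets when no feasible probability mass remains is also an assumption that the text does not discuss.

```python
import random


def sample_allocation(pmf, r, rng):
    """Sample how many assets of one type go on each of n routes.

    pmf -- list of n-1 rows; pmf[i][k] is the probability that route i+1
           receives k assets (k = 0..r). The last route has no pmf.
    r   -- total number of assets of this type
    Returns a list of n counts that sum to r.
    """
    remaining = r
    counts = []
    for row in pmf:
        # Only k = 0..remaining is still feasible. random.choices treats the
        # weights as relative, which renormalizes them (Equations 3.5, 3.6).
        weights = row[: remaining + 1]
        if sum(weights) > 0:
            k = rng.choices(range(remaining + 1), weights=weights)[0]
        else:
            k = 0  # assumed fallback: no feasible mass left on this route
        counts.append(k)
        remaining -= k
    counts.append(remaining)  # the last route receives whatever is left
    return counts


if __name__ == "__main__":
    rng = random.Random(300)
    uniform = [[1 / 6] * 6 for _ in range(3)]  # four routes, five assets
    print(sample_allocation(uniform, 5, rng))
```

Because the last route simply absorbs the remainder, every sampled plan uses exactly `r` assets, which matches the role of Route $n$ in the text.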
Table 2. An example of a renormalized distribution for Route 3 with a total of four routes ($n = 4$) and five assets ($r_j = 5$)


After all of the assets are allocated on the routes, we compute the probability of success for each route. We denote the probability of success on route $i$ in iteration $l$ for the $g$th interdiction plan $X^{l,g}$ by $p_i(X^{l,g})$, $i = 1, 2, \ldots, n$; see Appendix A, Section B for calculating these probabilities. We use the solution from either the Gittins Choice Algorithm or the Decreasing Choice Algorithm with these probabilities as the corresponding function values. The function values are then used to update the random mechanism, as we describe next. To update the random mechanism with the best interdiction plans, we only use the top $100\rho$ percent of function values, where $\rho \in [0,1]$ is set by the user. Since we want to minimize the function value, the best interdiction plans are those associated with the lowest function values. We denote by $\gamma$ the function value for the interdiction plan that has the $\lceil \rho G \rceil$th lowest function value, where $\lceil \cdot \rceil$ means to round up to the next integer. To update $\psi^l_{i,j,k}$, we use the fraction of times that $k$ assets of type $j$ are placed on route $i$ among the plans that have a function value of at most $\gamma$. Equation 3.7 shows the mathematical formula to update the probabilities:





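$$\hat{\psi}^l_{i,j,k} = \frac{\sum_{g=1}^{G} I_{\{f(X^{l,g}) \le \gamma\}}\, I_{\{X^{l,g}_{i,j} = k\}}}{\sum_{g=1}^{G} I_{\{f(X^{l,g}) \le \gamma\}}} \quad \text{for all } i, j, k \tag{3.7}$$

where $f(X^{l,g})$ denotes the function value of plan $X^{l,g}$ and $I_{\{\text{Boolean}\}}$ is an expression that is one when the Boolean expression is true and zero when it is false. We compute the random mechanism, $\psi^{l+1}$, for iteration $l+1$ by

$$\psi^{l+1} = \lambda \hat{\psi}^l + (1 - \lambda)\psi^l \tag{3.8}$$

where $\hat{\psi}^l$ is the collection $\{\hat{\psi}^l_{i,j,k} : 1 \le i < n,\ 1 \le j \le m,\ 0 \le k \le r_j\}$ given in Equation 3.7 and $\lambda$ is a smoothing parameter. $\lambda$ is a value between zero and one, given by the user, with $\lambda$ between 0.7 and 0.9 giving the best results (Allon et al., 2004).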
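The update of the random mechanism can be sketched for a single asset type as follows. This is a hedged illustration rather than the thesis's code: the function name `ce_update` and the data layout are assumptions, plans whose value ties with the threshold are all kept (an assumption), and `rho` must be strictly positive for the threshold to be defined.

```python
import math


def ce_update(psi, plans, values, rho, lam):
    """One cross-entropy update of the pmfs for a single asset type.

    psi    -- psi[i][k]: current probability that route i+1 gets k assets
    plans  -- plans[g][i]: assets placed on route i+1 in sampled plan g
    values -- values[g]: function value of plan g (lower is better)
    rho    -- fraction of best plans used for the update (0 < rho <= 1)
    lam    -- smoothing parameter of Equation 3.8
    """
    G = len(plans)
    n_elite = math.ceil(rho * G)
    # Threshold: the ceil(rho * G)-th lowest function value.
    gamma = sorted(values)[n_elite - 1]
    elite = [plans[g] for g in range(G) if values[g] <= gamma]
    new_psi = []
    for i, row in enumerate(psi):
        updated = []
        for k, old in enumerate(row):
            # Fraction of elite plans that put k assets on this route.
            freq = sum(1 for x in elite if x[i] == k) / len(elite)
            # Smooth between the elite frequency and the previous pmf.
            updated.append(lam * freq + (1 - lam) * old)
        new_psi.append(updated)
    return new_psi
```

Because each smoothed row is a convex combination of two probability vectors, it still sums to one, and with `lam` below one no entry that was positive ever drops to exactly zero.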


The stopping criterion of the CE method is based on the convergence of the sequence of $\psi^l$, which in the long term will converge to a degenerate matrix under certain assumptions (Allon et al., 2004). Our stopping criterion is when the same asset allocation for each route has the highest probability for $C$ consecutive iterations. The CE Asset Allocation Algorithm is stated in detail below.


CE Asset Allocation Algorithm

1. Set initial conditions
$n$: Number of routes ($i = 1, 2, \ldots, n$)
$m$: Number of asset types ($j = 1, 2, \ldots, m$)
$r_j$: Total number of assets of type $j$ ($k = 0, 1, \ldots, r_j$)
$q_{i,j}$: Probability for asset of type $j$ detecting a vessel on route $i$
$G$: Number of plans chosen for each random mechanism
$C$: Number of times that the highest probability for each route for each asset in the probability distribution is the same before terminating (stopping criterion)
$\rho$: Percentage of plans that have the lowest reward
$\lambda$: A smoothing parameter for Equation 3.8
2. Set $c = 1$, $g = 1$, and $l = 1$
3. Construct the random mechanism ($\psi^l_{i,j,k}$) using a uniform distribution for each asset type ($j$) on each route ($i$) distributed across the asset allocations ($k$)
4. Choose $X^{l,g}$ according to $\psi^l$
5. Calculate $p_i(X^{l,g})$ for all $i$ using Equation A.1 with $p_i(X^{l,g})$ replacing $p_i(X)$
6. Calculate the value from the Gittins Choice Algorithm or Decreasing Choice Algorithm with $p_i(X^{l,g})$ replacing $p_i$ for all $i$
7. Increase $g$ by 1
8. If $g \le G$ then go back to step 4. Else, solve for $\gamma$ and increase $l$ by 1
9. Update $\psi^l$ according to Equation 3.8
10. If $\arg\max_k \psi^l_{i,j,k} = \arg\max_k \psi^{l-1}_{i,j,k}$ for each asset type ($j$) on each route ($i$) then increase $c$ by 1. Else, reset $c = 1$.
11. If $c < C$ then reset $g = 1$ and go to step 4. Else, stop and output the interdiction plan $X = (X_{1,1}, X_{1,2}, \ldots, X_{n,m})$, where $X_{i,j} = \arg\max_k \psi^l_{i,j,k}$.
IV. NUMERICAL RESULTS


A. CASE STUDY


In this chapter, we present a case study that addresses a specific scenario, the inputs into the algorithms, the results of multiple runs, and the insights resulting from this analysis. The evader in our scenario is a DTO that uses SPSSs to transport its drugs out of Colombia by sea. The interdictor in our scenario is the Coalition Force, which uses its assets to interdict the SPSSs.


1. Routes 


The environment consists of five drug shipment routes from Colombia. Figure 7 
shows the location of the routes graphically. Route 1 goes to the Galapagos Islands and 
then up to Central America. Route 2 goes through the Eastern Pacific along the coast to 
Central America. Route 3 goes through the Caribbean Sea along the coast to Central 
America. Route 4 goes through the Caribbean Sea into the Gulf of Mexico to Mexico. 
Route 5 goes through the Caribbean Sea to Cuba. 


Figure 7. A map showing the five routes (From Google Earth).
2. Evader


We solve the Adapting Evader Model with the Gittins Choice Algorithm and also the Decreasing Choice Algorithm, which yield an optimal and a heuristic policy, respectively. The DTO is either patient ($a = 0.99$ and $\epsilon_0 = 10$) or impatient ($a = 0.50$ and $\epsilon_0 = 0$). When the DTO is patient, he explores more often. For instance, using the Decreasing Choice Algorithm with $\epsilon_0 = 10$, the DTO randomly explores for at least ten time periods. If the DTO is impatient then a success now significantly outweighs the value of a success later and thus there is no time to explore; the DTO always exploits. The DTO may have a predetermined belief, which is manifested in $\alpha^0$ and $\beta^0$ of the beta distribution. A DTO with a predetermined belief is initially more likely to choose one route over another. Table 3 shows $\alpha^0$ and $\beta^0$ for the DTO with a predetermined belief, while Table 4 shows $\alpha^0$ and $\beta^0$ for a DTO without a predetermined belief.


Route | Initial Alpha | Initial Beta | Beta Distribution Mean | Beta Distribution Variance


oe 
3010 [0.500.071 
as iso 0027 
paso 025 0.0086 


Table 3. Initial belief on the routes for a DTO with a predetermined belief 


Route | Initial Alpha | Initial Beta | Beta Distribution Mean | Beta Distribution Variance
1 | 1 | 1 | 0.50 | 0.0833
2 | 1 | 1 | 0.50 | 0.0833
3 | 1 | 1 | 0.50 | 0.0833
4 | 1 | 1 | 0.50 | 0.0833
5 | 1 | 1 | 0.50 | 0.0833

Table 4. Initial belief on the routes for a DTO without a predetermined belief




There are eight different cases obtained from the three characteristics of the DTO (decision process, patience, and predetermined belief), which are shown in Table 5.

Case | Decision Process | Patience | Predetermined Belief
1 | Optimal | Patient | Yes
2 | Optimal | Patient | No
3 | Optimal | Impatient | Yes
4 | Optimal | Impatient | No
5 | Heuristic | Patient | Yes
6 | Heuristic | Patient | No
7 | Heuristic | Impatient | Yes
8 | Heuristic | Impatient | No

Table 5. Cases for the DTO using the Gittins Choice Algorithm for the optimal decision process and the Decreasing Choice Algorithm for the heuristic decision process
3. Interdictor


The Coalition Force has three asset types ($j = 1, 2, 3$) for its use: four Navy P-3 aircraft ($j = 1$), two Navy E-2 aircraft ($j = 2$), and two Coast Guard Cutters ($j = 3$). There are 15,750 possible interdiction plans for the eight assets allocated on the five routes. Appendix C explains how to compute the total number of plans.
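The count of 15,750 plans can be reproduced with a short calculation. The sketch below assumes that assets of the same type are interchangeable, so that placing $r_j$ assets on $n$ routes is a stars-and-bars count; Appendix C may present the derivation differently, and the function name `count_plans` is hypothetical.

```python
from math import comb


def count_plans(n_routes, assets_per_type):
    """Number of ways to place interchangeable assets of each type on routes."""
    total = 1
    for r in assets_per_type:
        # Stars and bars: r identical assets into n_routes distinct routes.
        total *= comb(r + n_routes - 1, n_routes - 1)
    return total


if __name__ == "__main__":
    # Four P-3s, two E-2s, and two cutters on five routes: 70 * 15 * 15.
    print(count_plans(5, [4, 2, 2]))  # 15750
```

A search space of this size could be enumerated directly if each plan were cheap to evaluate, but every evaluation here is itself a Monte Carlo estimate, which is one reason a sampling method such as CE is attractive.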


Each individual asset type has a probability of detecting an SPSS, which depends upon the route it is searching. The assets perform a barrier search on their assigned route. To calculate the probability of detecting an SPSS we use the inverse cube law for an aircraft and the linear law for a ship. These probabilities are listed in Table 6. For a more detailed explanation of these probabilities, see Appendix A.


Table 6. The probabilities of each Coalition Force asset detecting an SPSS on each route for the case study ($q_{i,j}$)
B. NUMERICAL RESULTS


The results depicted below are obtained from running the CE Asset Allocation Algorithm with either the Gittins Choice Algorithm or the Decreasing Choice Algorithm described in Chapter III, using VBA for Microsoft Excel. All of the simulations use the same seed of 300 for the random number generator. The values for the parameters for the Gittins Choice Algorithm and the Decreasing Choice Algorithm are: $T = 298$; $a = 0.99$ for a patient DTO and $a = 0.50$ for an impatient DTO; $\epsilon_0 = 10$ for a patient DTO and $\epsilon_0 = 0$ for an impatient DTO; $\alpha_i^0$ and $\beta_i^0$ are in Table 3 and Table 4. We use a value of 298 for $T$ because, with $a = 0.99$, a success at time period 298 has a discounted value of only about 0.05. The values of the parameters for the CE Asset Allocation Algorithm are: $n = 5$; $m = 3$; $r_1 = 4$, $r_2 = 2$, and $r_3 = 2$; $q_{i,j}$ are in Table 6; $G = 1000$; $C = 20$; $\rho = 0.01$; and $\lambda = 0.9$.


1. Optimal Interdiction Plans 


Tables 7-14 give the interdiction plans from the CE Asset Allocation Algorithm for the cases from Table 5. For instance, Plan 1 is the interdiction plan that the CE Asset Allocation Algorithm develops for Case 1. We use the Gittins Choice Algorithm in the CE Asset Allocation Algorithm to choose Plans 1-4. We use the Decreasing Choice Algorithm in the CE Asset Allocation Algorithm to choose Plans 5-8. We note that a patient DTO (as in Cases 1, 2, 5, and 6) would have about 94 discounted successes if he were successful in every time period. We also note that an impatient DTO (as in Cases 3, 4, 7, and 8) would have about 1 discounted success if he were successful in every time period. We denote this value as the maximum possible number of discounted successes.
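The two maxima quoted above, and the value of a success in the final time period, follow from summing the discount factor over the horizon. The sketch below assumes that a success in time period $t$ is worth $a^t$ with $t$ starting at 1; the function name `max_discounted_successes` is hypothetical.

```python
def max_discounted_successes(a, T):
    """Discounted successes if every run from t = 1 to T succeeds."""
    return sum(a ** t for t in range(1, T + 1))


if __name__ == "__main__":
    T = 298
    print(max_discounted_successes(0.99, T))  # patient DTO: about 94
    print(max_discounted_successes(0.50, T))  # impatient DTO: about 1
    print(0.99 ** T)                          # worth of a success at T: about 0.05
```

Dividing a simulated number of discounted successes by the matching maximum puts patient and impatient cases on a common scale.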


Table 7 gives the interdiction plan corresponding to Case 1. The majority of the assets are located on Routes 1 and 2. The only other route being searched is Route 4 with an E-2.

Table 7. Interdiction plan for Case 1 with an expected number of discounted successes of 50.1


Table 8 gives the interdiction plan corresponding to Case 2. The assets are equally dispersed amongst all of the routes.

Probability of Success: $p_i(X)$ | 0.7097 | 0.7018 | 0.6854 | 0.8324 | 0.7064

Table 8. Interdiction plan for Case 2 with an expected number of discounted successes of 72.6


Table 9 gives the interdiction plan corresponding to Case 3. All of the assets are located on Routes 1 and 2.

Probability of Success: $p_i(X)$ | 0.7134 | 0.1223 | 1.0000 | 1.0000 | 1.0000

Table 9. Interdiction plan for Case 3 with an expected number of discounted successes of 0.123


Table 10 gives the interdiction plan corresponding to Case 4. The assets are dispersed amongst all of the routes except Route 4.

Probability of Success: $p_i(X)$ | 0.77729 | 0.58656 | 0.68539 | 1 | 0.54486

Table 10. Interdiction plan for Case 4 with an expected number of discounted successes of 0.745


Table 11 gives the interdiction plan corresponding to Case 5. The assets are dispersed amongst all of the routes except Route 5.

Probability of Success: $p_i(X)$ | 0.5788 | 0.4566 | 0.9544 | 0.6854 | 1.0000

Table 11. Interdiction plan for Case 5 with an expected number of discounted successes of 59.9


Table 12 gives the interdiction plan corresponding to Case 6. The assets are equally dispersed amongst all of the routes.

Probability of Success: $p_i(X)$ | 0.8446 | 0.7018 | 0.6854 | 0.6854 | 0.7064

Table 12. Interdiction plan for Case 6 with an expected number of discounted successes of 72.5


Table 13 gives the interdiction plan corresponding to Case 7. All of the assets are located on Routes 1 and 2.

Probability of Success: $p_i(X)$ | 0.6565 | 0.1463 | 1.0000 | 1.0000 | 1.0000

Table 13. Interdiction plan for Case 7 with an expected number of discounted successes of 0.151


Table 14 gives the interdiction plan corresponding to Case 8. The assets are equally dispersed amongst all of the routes.

Probability of Success: $p_i(X)$ | 0.8446 | 0.3497 | 0.6854 | 0.8721 | 0.7927

Table 14. Interdiction plan for Case 8 with an expected number of discounted successes of 0.745


Tables 9 and 13 are very similar, with all of the assets assigned to the first two routes. Table 7 also has all of the assets assigned to the first two routes except for one asset on Route 4. All three of these cases have a DTO that has a predetermined belief. Table 11 also has a DTO with a predetermined belief, but the assets are dispersed over four out of the five routes, with most of the assets on Route 1. Tables 8, 12, and 14 are similar, with the assets evenly dispersed over the routes. The DTO in each of these cases does not have a predetermined belief. In Table 10, the DTO does not have a predetermined belief, and the assets are allocated on all of the routes except for one.


2. Comparing the Plans


To better understand which characteristics of the DTO drive the CE Asset Allocation Algorithm to choose an interdiction plan, we create sixty-four sub cases. Each sub case is a combination of a case and a plan. We use the Gittins Choice Algorithm and the Decreasing Choice Algorithm to estimate their expected number of successes with 1000 replications. See Appendix D for the full case study results.


Since the sub cases do not all have the same maximum possible value for the number of discounted successes, we standardize the values. We use the percentage of discounted successes as defined by the fraction

$$\frac{\text{number of discounted successes}}{\text{maximum possible number of discounted successes}} \tag{5.1}$$

To explore the results, we group the sub cases in pairs. In each pair, the sub cases differ in one of the three characteristics of the DTO and share the other two. There are eight pairs for each set of analysis. To be 95% confident in our findings, involving eight combinations, we use a confidence level of $100\%(1 - 0.025/8)$ for the individual sub cases.
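For concreteness, the individual confidence level stated above evaluates as follows. This is simply the arithmetic of the expression in the text, shown as a worked step.

```latex
100\%\left(1 - \frac{0.025}{8}\right)
  = 100\%\,(1 - 0.003125)
  = 99.6875\%
```

Each of the eight pairwise comparisons is therefore judged at roughly the 99.7% level so that the eight comparisons taken together retain the stated overall confidence.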


a. Optimal Versus Heuristic Decision Process by the DTO 


The sub cases in Figure 8 are grouped by the DTO's decision process. In one sub case the DTO chooses routes optimally and in the other sub case the DTO chooses routes heuristically. The other two characteristics, which are the DTO's patience and initial belief, are the same within the group. For instance, Case 1 is a patient DTO that chooses the routes optimally and has a predetermined belief. Plan 1 is the interdiction plan against Case 1. Case 5 is a patient DTO that chooses the routes heuristically and has a predetermined belief. Plan 5 is the interdiction plan against Case 5.

Figure 8. Pairings of optimal versus heuristic sub cases

All of the pairs have a close percentage of discounted successes, with four out of the eight pairs statistically indistinguishable. Those four pairs are Case 1 with Plans 5 and 6, Case 3 with Plans 3 and 7, Case 4 with Plans 4 and 8, and Case 8 with Plans 4 and 8. When comparing the interdiction plans, the plans chosen by the CE Asset Allocation Algorithm for cases that differ only by the decision process are very similar; see Tables 7-14. This indicates that the decision process of the DTO does not play a role when the CE Asset Allocation Algorithm chooses the interdiction plan.


b. Patient Versus Impatient DTO 


The sub cases in Figure 9 are grouped by the DTO's patience. In one sub case the DTO is patient and in the other sub case the DTO is impatient. The other two characteristics, which are the DTO's decision process and initial belief, are the same within the group.

Figure 9. Pairings of patient versus impatient sub cases

Two out of the eight pairs are statistically indistinguishable. Those two are Case 4 with Plans 2 and 4, and Case 8 with Plans 6 and 8. When comparing the interdiction plans, the plans chosen by the CE Asset Allocation Algorithm for cases that differ only by patience are close, differing by a couple of assets; see Tables 7-14. The patience of the DTO thus plays a role when the CE Asset Allocation Algorithm chooses the interdiction plan.


c. Predetermined Belief Versus No Predetermined Belief


The sub cases in Figure 10 are grouped by the DTO's initial belief. In one sub case the DTO has a predetermined belief and in the other sub case the DTO does not have a predetermined belief. If a DTO has a predetermined belief then he starts the scenario more willing to use some routes over other routes. A DTO with no predetermined belief starts the scenario willing to try every route equally. The other two characteristics, which are the DTO's decision process and patience, are the same within the group.

Figure 10. Pairings of predetermined belief versus no predetermined belief

None of the eight pairs are statistically indistinguishable. When comparing the interdiction plans, the plans chosen by the CE Asset Allocation Algorithm for cases that differ only by the initial belief are very different; see Tables 7-14. The plans against a DTO with a predetermined belief allocate most if not all of the assets on Route 1 and Route 2. Since a DTO with a predetermined belief will first try these routes, the Coalition Force gets the most benefit with assets on these routes. The plans against a DTO that does not have a predetermined belief disperse the assets among all of the routes, since the DTO has an equal chance of initially trying any of the routes. The initial belief of the DTO plays a role when the CE Asset Allocation Algorithm chooses the interdiction plan.


3. The DTO’s Perspective 

In the following analysis, we compare the eight cases, which are defined in Table 5, from the DTO's perspective. With eight cases, there are $8 \cdot 7 / 2 = 28$ combinations. To be 95% confident in our findings involving twenty-eight combinations, we use a confidence level of $100\%(1 - 0.025/28)$ for the individual sub cases. To get the average percentage of discounted successes for a case, for example Case 1, we take an average over the percentages of discounted successes for the sub cases that involve Case 1.


a. Discounted Successes 


Figure 11 displays the average percentage of discounted successes by case. The only cases that are statistically indistinguishable are Case 8 with Case 4 and Case 3 with Case 7.

Figure 11. Average Percentage of Discounted Successes by Case

Case 2 is the best case for the DTO. Case 2 is a patient DTO that chooses the routes optimally and does not have a predetermined belief. Exploring the data further, the top four cases have one common feature: the DTO does not have a predetermined belief. This allows the DTO to be flexible and find the best route. If this DTO is playing against a plan created to combat a predetermined belief then he can easily find the uncovered routes. On the other hand, if this DTO is playing against a plan that was created without a predetermined belief then the limited interdiction assets are spread across the routes, which limits the probability of detecting a vessel on each route.
b. Finding the Best Route


Figure 12 is the average time period it took the DTO to find the best route. For each case, we take an average of the time periods at which the DTO finds the best route over the associated eight sub cases. The only cases that are statistically indistinguishable are Case 3 with Case 7 and Case 7 with Case 1.

Figure 12. The average time the DTO found the best route given he found the best route

Finding the best route quickly, in the replications where it is found, does not guarantee a high average percentage of discounted successes. Case 2 has the highest average percentage of discounted successes and a very low time to find the best route. Case 6, though, has the second highest average percentage of discounted successes and the second highest time to find the best route. This disparity has two possible explanations. The first is that failing to find the best route does not mean the DTO did not find a very good route. The DTO would thus not do as well as he could have if he had found the best route, but could still do very well. The second possible explanation is the collection of the data. The DTO did not always find the best route, and the replications in which he did not contribute no data to Figure 12, so the information is skewed. To help mitigate this problem, Figure 13 shows the average percentage of the number of times the DTO finds the best route.

Figure 13. Average Percentage the DTO finds the best route

As Figure 14 shows, though, the more times that the DTO finds the best route, even if he finds it late, the higher the percentage of discounted successes. Figure 14 also confirms that Case 2 is the best for the DTO.

Figure 14. Trend of discounted successes with finding the best route


C. SUMMARY


While planning, the Coalition Force needs to take into account a DTO that is capable of learning the location of assets. The pace at which the DTO learns the location of the Coalition Force assets depends on the DTO's characteristics. By determining which characteristics of the DTO to center on, the Coalition Force can focus its intelligence gathering efforts and ultimately make a more informed decision on the interdiction plan.


The primary characteristic to focus on is the DTO's initial belief. If the DTO has a predetermined belief and the Coalition Force can ascertain that belief, then they may be able to ignore some possible routes and focus their efforts in a particular region. This is currently being seen, with the majority of the SPSSs found in the Eastern Pacific.


The other characteristic to be aware of is the patience of the DTO. Does the DTO feel that he quickly needs to make successful runs? If he does, then he will be more willing to use the routes that he currently knows about. For instance, a patient DTO will be willing to try the SPSS in the Caribbean Sea, while an impatient DTO will stick with using the Eastern Pacific.


The one aspect of the DTO that does not need to be considered is how he chooses his routes, optimally or heuristically. The choice will affect the number of discounted successes, but not the development of the interdiction plan.


The length of time that the interdiction plan should be used varies greatly and is not an easy question to answer. One way to try to answer it is with a worst-case scenario. The DTO's best case (and thus worst case for the Coalition Force) is Case 2. Case 2 finds the best route on average in about 40 time periods. From the time that the DTO finds the best route until the Coalition Force changes its interdiction plan, the DTO will maximize his expected number of successes. To keep the DTO from maximizing his expected number of successes, the Coalition Force needs to change its interdiction plan. By changing the allocation of the Coalition Force's assets before the DTO can take advantage of learning the best route, the Coalition Force can increase the number of SPSSs that it interdicts.
V. CONCLUSION AND RECOMMENDATIONS


A. CONCLUSION


This thesis focuses on a portion of the drug trafficking problem. It looks at the shipment of cocaine from South America into the United States using SPSSs. The SPSSs are difficult to detect, and they can carry a huge amount of cargo. Currently the cargo is only drugs, but there is potential for the cargo to be other contraband items. The Coalition Force needs to develop better ways to stop the use of the SPSS.


In this thesis, we construct models and then develop algorithms to gain insight on a DTO that is capable of learning the placement of interdiction assets. We first use our algorithms to develop interdiction plans given different assets based on a DTO that is capable of learning and adapting. We then use these algorithms to analyze how well a DTO performs against the plans. By analyzing how a DTO that is capable of learning reacts to different optimal plans, we can gain a better understanding of how the Coalition Force can more effectively allocate its limited number of assets to impede the effectiveness of the SPSS. One such insight is that the interdiction plan is not dependent on the technique the DTO uses to choose his routes. Other attributes such as his patience and prior belief of success on the routes play a more substantial role in designing the interdiction plan. Another insight is the time it takes the DTO to find the route with the highest probability of success. In the Coalition Force's worst-case scenario, a patient DTO that chooses the routes optimally and does not have a prior belief of success takes about forty tries. This is about when the interdictor should change his interdiction plan.


B. SUGGESTED WORK AHEAD 


Since the DTO has more than one type of vessel that he uses, an extension of the thesis should allow the evader to choose among multiple vessels. Each vessel costs a different amount to build, has a different probability of detection, and has a different payout if the vessel is successful.


In a possible extension, the DTO's decisions could be split into two phases. The first phase is the planning phase. In the planning phase, the DTO decides which vessels to build. The second phase is the deployment phase. In each time period of this phase, the DTO sends a vessel over a route to get the maximum expected number of discounted successes.


Allowing the DTO to have multiple vessels will give insight into which vessel is the most beneficial to the DTO. The Coalition Force can then focus its efforts on stopping that vessel to best impede the drug trade.
APPENDIX A. PROBABILITY OF DETECTION


This appendix presents the mathematical tools used to calculate the probability of detection for each asset and the probability of success for the evader on each route. It also describes how the probabilities were calculated for the case study.


A. BARRIER PATROL


Since the evader's vessels are traversing the routes, the interdictor's assets perform a barrier patrol across their assigned routes. The assumptions for a barrier patrol are that the evader's probability of detection does not change and that the time at which the evader transits the barrier is unknown but equally likely throughout the time interval under consideration (Wagner et al., 1999). To calculate the probability of detection, some parameters need to be known. One is the width of the route being patrolled, $d_i$. The speed of the interdictor's asset of type $j$ ($v_j$) and the speed of the evader's vessel ($u$) also need to be known. We denote the sweep width associated with the interdictor's asset of type $j$ on route $i$ as $w_{i,j}$. Depending on the method chosen to calculate the probability of detection, more parameters may need to be known.


1. Inverse Cube Law 


If the speed of the interdictor’s asset of type j is much greater than the evader’s 


speed then the inverse cube law can be used to calculate the asset’s probability of 
detecting the evader. This is common when the interdictor asset is an aircraft and the 
evader’s vessel is a ship. For the inverse cube law, two other parameters are used. The 


first is the number of assets that are in the barrier patrol (7). The last parameter needed 
is the track spacing (s). The probability of detecting an evader on route i with an 


interdiction asset of type j is 


b,, = 2), eddy (B.1) 
where g is the standard normal probability density function (mean zero and variance one), and (see Wagner et al., 1999)




mW; ; 
= {ot B.2 
ae ieee (B.2) 


Rene sel Ga (B.3) 


and 





2. Linear Law 


If the speed of the interdictor’s asset of type j is about the same as the evader’s 
speed then the linear law can be used to calculate the asset’s probability of detecting the 
evader. This is common when the interdictor asset is a ship and the evader’s vessel is a 
ship. The probability of detecting an evader on route i with an interdiction asset of type 


j is (see Wagner et al., 1999)


Vb’ +1-1 : 1 ; 
b= 1 [ - h(i)’ if b<2,Jh(h+1) (B.4) 





1, otherwise 
where

$$b = \frac{v_j}{u} \qquad \text{(B.5)}$$

and

$$h = \frac{d_i - w_{ij}}{w_{ij}} \qquad \text{(B.6)}$$
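As a worked illustration of the two ratios, the following sketch evaluates them for a Coast Guard Cutter on route 5 using case-study values quoted in Section C of this appendix (cutter speed 15 knots, SPSS speed 6 knots, route width 80 NM, Caribbean sweep width 1.7 NM). It assumes that Equation B.5 reads b = v_j / u and that Equation B.6 reads h = (d_i - w_ij) / w_ij. The function name is hypothetical, and the sketch stops short of evaluating Equation B.4.

```python
def linear_law_ratios(v_j, u, d_i, w_ij):
    """Return (b, h) for the linear law.

    Assumed forms (read from Equations B.5 and B.6):
        b = v_j / u               speed ratio, asset over evader
        h = (d_i - w_ij) / w_ij   uncovered route width in sweep widths
    """
    if u <= 0 or w_ij <= 0:
        raise ValueError("evader speed and sweep width must be positive")
    b = v_j / u
    h = (d_i - w_ij) / w_ij
    return b, h


# Cutter (j = 3) on route 5: speeds in knots, widths in nautical miles.
b, h = linear_law_ratios(v_j=15.0, u=6.0, d_i=80.0, w_ij=1.7)
print(f"b = {b:.2f}, h = {h:.2f}")  # b = 2.50, h = 46.06
```

The large value of h reflects how narrow the cutter's sweep width is relative to the 80 NM route.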


B. PROBABILITY OF DETECTING A VESSEL ON A ROUTE 


Once the interdictor has decided on an interdiction plan X, the actual probability of detecting a vessel on each route needs to be computed. We assume that the probability of detecting a vessel is independent among the interdictor’s assets. We also assume that when a vessel is detected, it is captured. Using these two assumptions, the probability that an evader transits route i successfully under an interdiction plan X is

$$p_i(X) = \prod_{j=1}^{m} \left(1 - b_{ij}\right)^{x_{ij}} \quad \forall i \qquad \text{(A.1)}$$
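The independence assumption can be illustrated with a short sketch. It assumes Equation A.1 is the product, over the m asset types, of (1 - b_ij) raised to the number of assets of type j that the plan places on route i. The function name and the numeric values are hypothetical and are not taken from the case study.

```python
import math


def route_success_probability(b_i, x_i):
    """Probability that the evader transits one route undetected.

    b_i: detection probability of a single asset of each type on this route
    x_i: number of assets of each type the plan places on this route
    Assumes detections are independent across assets (Appendix A, Section B).
    """
    if len(b_i) != len(x_i):
        raise ValueError("b_i and x_i must have one entry per asset type")
    return math.prod((1.0 - b) ** x for b, x in zip(b_i, x_i))


# Hypothetical route with three asset types: two assets of type 1,
# none of type 2, and one of type 3.
p = route_success_probability([0.30, 0.50, 0.10], [2, 0, 1])
print(round(p, 3))  # 0.441
```

An asset type with zero assets on the route contributes a factor of one, so it leaves the evader's success probability unchanged.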
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C. CASE STUDY’S PROBABILITIES 


1. Routes 


There are five routes for the case study presented in Chapter IV. Each of these 
routes has a width associated with it. These routes are in either the Eastern Pacific or the 


Caribbean Sea. Table 15 displays this information for the routes. The widths are in 


nautical miles. 


| 5 | 80 | Caribbean Sea 


Table 15. Width and area for each route in the case study 





As seen in Figure 15, the two locations have different sea surface winds. For this 
thesis, the Eastern Pacific has winds that are approximately 10 knots (5.14 
meters/second) and the Caribbean Sea has winds that are approximately 15 knots (7.72 


meters/second). 


[Figure: SSM/I surface wind speed in meters/second, monthly average for February 2010, with a histogram of the data in the selected area (min 0.60, max 21.60, mean 6.39, RMS 2.10).]

Figure 15. Average winds for February 2010 (From Remote Sensing System, 2010)
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2. Evader 


In the case study, the DTO has an unlimited supply of SPSSs that he wants to use to ship his drugs. The SPSS traverses the route at 6 knots ($u = 6$), which is the same assumption that Pfeiff (2009) makes. This speed is a little less than half of the maximum speed of an SPSS.


3. Interdictor 

In the case study, the Coalition Force has three asset types ($m = 3$). There are four Navy P-3 aircraft ($j = 1$), two Navy E-2 aircraft ($j = 2$), and two Coast Guard Cutters ($j = 3$). The Navy P-3s search for the SPSSs at a speed of 180 knots ($v_1 = 180$), which is the same assumption that Pfeiff (2009) makes. They will not be on station searching for the whole time due to fuel requirements, crew rest, and limited flight hours.
For this reason the number of aircraft per P-3 will be a fraction (7, =\4), since the 
aircraft will be on station for only a fraction of the time. The P-3s that are deployed in 
JIATF-South’s AOR are equipped with the APS-137 radar. Since the Coalition Force is looking for SPSSs, which have a small cross section above the water, the sweep widths for the P-3s are 8.6 NM for the Eastern Pacific routes and 3.1 NM for the Caribbean Sea routes. This means $w_{11} = w_{21} = 8.6$ and $w_{31} = w_{41} = w_{51} = 3.1$. These numbers are taken from Table 16 for the “4 to 10 person life raft” and winds “to 10” and “to 15.”


Table H-27 Sweep Widths for Forward-Looking Airborne Radar (AN/APS-137)

16 Nautical Mile Radar Range Scale (Sweep Width in Nautical Miles)
On scene Surface Winds (kts)
Object Type | <5 | to 10 | to 15 | to 20 | to 25 | to 35 | to 45 | to 55 | to 65 | >65
4 to 10 person life raft | 12.1 | 8.6 | 3.1 | 0 | 0 | 0 | 0 | 0 | 0 | 0
17 to 25 foot recreational boat | 13.6 | 11.9 | 8.2 | 2.8 | 0 | 0 | 0 | 0 | 0 | 0
26 to 35 foot recreational boat | 16.6 | 16.3 | 15.4 | 14.2 | 12.6 | 9.5 | 3.9 | 0 | 0 | 0
36 to 50 foot recreational boat | 21.0 | 20.7 | 19.9 | 18. | 17.5 | 14.7 | 9.8 | 3.5 | 0 | 0

32 Nautical Mile Radar Range Scale (Sweep Width in Nautical Miles)
On scene Surface Winds (kts)
Object Type | <5 | to 10 | to 15 | to 20 | to 25 | to 35 | to 45 | to 55 | to 65 | >65
17 to 25 foot recreational boat | 17.4 | 15.7 | 12.0 | 6.6 | 0 | 0 | 0 | 0 | 0 | 0
26 to 35 foot recreational boat | 22.1 | 21.7 | 20.9 | 19.7 | 18.1 | 14.9 | 9.3 | 2.1 | 0 | 0
36 to 50 foot recreational boat | 29.0 | 28.7 | 27.9 | 26.9 | 25.5 | 22.7 | 17.8 | 11.5 | 3.8 | 0

Table 16. APS-137 sweep widths (From USCG Addendum, 2004) 
The Navy E-2s search for the SPSSs at a speed of 180 knots ($v_2 = 180$), which is the same assumption that Pfeiff (2009) makes. The E-2s will also not be on station
searching for the whole time due to the same limitations as the P-3s. For this reason the 
number of aircraft per E-2 will also be a fraction (7, =1/7). The E-2s are equipped with 
the APS-145 radar. We obtain the E-2’s sweep width by applying a factor of 2.5 to the P-3’s sweep width, since the APS-145 antenna area is roughly 20 times the APS-137 antenna area, as explained by Pfeiff (2009). The sweep widths for the E-2s are 21.5 NM for the Eastern Pacific routes and 7.8 NM for the Caribbean Sea routes. This means $w_{12} = w_{22} = 21.5$ and $w_{32} = w_{42} = w_{52} = 7.8$.


We assume that the Coast Guard Cutter searches for the SPSSs at a speed of 15 knots ($v_3 = 15$). Coast Guard Cutters are equipped with the SPS-73 radar. Looking at Table 17, the sweep width for a raft with a reflector (such as the exhaust tubing on the SPSS) in moderate rain (worst case scenario) is 4.7 NM and 1.7 NM for winds “to 10” and “to 15,” respectively. This means $w_{13} = w_{23} = 4.7$ and $w_{33} = w_{43} = w_{53} = 1.7$.


Table H-25 Sweep Widths and Recommended Settings for AN/SPS-73 Radar (4-10 person life rafts with and without radar reflectors)

SWEEP WIDTHS FOR AN/SPS-73 RADAR (Nautical Miles)
On scene Surface Winds (kts)

WEATHER | OBJECT TYPE | sweep widths by wind column, final column >15
No Rain or Drizzle | Raft w/ reflector | : : 5. unknown
No Rain or Drizzle | Raft w/o reflector | 5. 2 E nil
Moderate Rain | Raft w/ reflector | : 7 7 unknown
Moderate Rain | Raft w/o reflector | 3 5 3 nil

RECOMMENDED SETTINGS
Range Scale: 6 NM range scale
Pulse Width: M1 pulse width (AUTO)
STC: Zero
FTC: Less than 80% for no rain, at least 80% for rain
Persistence: No higher than 15
Interference Rejection: ON at 100%

Table 17. SPS-73 sweep widths (From USCG Addendum, 2004)
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Substituting the above values into Equation B.1 for the P-3 and E-2 and into 
Equation B.4 for the Coast Guard Cutter returns the probability of detection for the assets 


on each route, as seen in Table 6. 
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APPENDIX B. GITTINS INDEX TABLES 


This appendix presents the equation that estimates the Gittins Index (Equation 3.4) and pre-calculated indices for a = 0.50 (Tables 18-19) and a = 0.99 (Tables 21-23).

Equation 3.4 is valid when a = 0.50, $\alpha > 20$, and $\beta > 20$. It is also valid when a = 0.99 and $\alpha + \beta > 40$. In Equation 3.4, $\nu(\alpha, \beta)$ is the Gittins Index.
- 
|v(a,A)-a*(a+ py" | =(a+A)*ar?* B+ A+B* a+ )+C*(a+f) ')*(I-a)"" (A.1) 


The values for A, B, and C are also in Gittins’ book (Gittins, 1989). They appear in Table 20 for a = 0.50 and in Table 24 for a = 0.99.


α \ β | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
1 | 0.559 | 0.3758 | 0.2803
2 | 0.706 | 0.5359 | 0.4298
3 | 0.7772 | 0.6289 | 0.5258
4 | 0.8199 | 0.6899 | 0.5937
5 | 0.8485 | 0.7333 | 0.6441
6 | 0.8691 | 0.7658 | 0.6832
7 | 0.8847 | 0.7911 | 0.7144
8 | 0.8969 | 0.8114 | 0.7399 | 0.6795 | 0.6279 | 0.5835 | 0.5447 | 0.5107 | 0.4807 | 0.4541
10 | 0.9148 | 0.8419 | 0.7791
11 | 0.9216 | 0.8537 | 0.7946
12 | 0.9274 | 0.8638 | 0.808
13 | 0.9324 | 0.8726 | 0.8197
14 | 0.9376 | 0.8804 | 0.8301
15 | 0.9405 | 0.8872 | 0.8395
16 | 0.9439 | 0.8933 | 0.8476
17 | 0.9465 | 0.8988 | 0.855
18 | 0.9496 | 0.9037 | 0.8618
19 | 0.952 | 0.9081 | 0.8679
20 | 0.9542 | 0.9122 | 0.8736


Table 18. Pre-Calculated Gittins Index when a=0.50 Part 1 (From Gittins, 1989) 
α \ β | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20
1 | 0.0888 | 0.0816
2 | 0.1619 | 0.15 | | 0.1307 | 0.1228 | 0.1158
3 | 0.2236 | 0.2084 | | 0.1833 | 0.1729 | 0.1636
4 | 0.2764 | 0.2589 | | 0.2297 | 0.2174 | 0.2064
5 | 0.3223 | 0.3032
6 | 0.3627 | 0.3423
7 | 0.3983 | 0.3773
8 | 0.4302 | 0.4086
9 | 0.4587 | 0.4369 | | 0.3988 | 0.3821 | 0.3668
10 | 0.4845 | 0.4625 | | 0.424 | 0.407 | 0.3913
11 | 0.5079 | 0.4859 | 0.4657 | 0.447 | 0.4298 | 0.4139 | 0.3991 | 0.3853 | 0.3724 | 0.3603
12 | 0.5294 | 0.5073 | | 0.4683 | 0.451 | 0.4349
13 | 0.549 | 0.527 | | 0.488 | 0.4706 | 0.4544
14 | 0.567 | 0.5453 | | 0.5063 | 0.4888 | 0.4726
15 | 0.5837 | 0.5621
16 | 0.599 | 0.5777
17 | 0.6133 | 0.5922
18 | 0.6266 | 0.6058
19 | 0.639 | 0.6185
20 | 0.6506 | 0.6304


Table 19. Pre-Calculated Gittins Index when a=0.50 Part 2 (From Gittins, 1989) 


| α/(α+β) | A | B | C |
| 0.6 | 5.0689 | 3.8097 | -0.839 |
| 0.8 | 7.6929 | 4.2241 | -2.383 |
| 0.9 | 11.906 | 5.1853 | -5.885 |


Table 20. Pre-Calculated A, B, and C when a=0.50 (From Gittins, 1989) 
Table 21. Pre-Calculated Gittins Index when a=0.99 Part 1 (From Gittins, 1989)
α \ β | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26
1 | 0.1491 | 0.1384 | 0.129 | 0.1206 | 0.1132 | 0.1066 | 0.1006 | 0.0952 | 0.0903 | 0.0858 | 0.0817 | 0.078 | 0.0745
2 | 0.2228 | 0.2086 | 0.196 | 0.1847 | 0.1746 | 0.1654 | 0.157 | 0.1494 | 0.1425 | 0.1361 | 0.1302 | 0.1248 | 0.1198
3 | 0.2799 | 0.2637 | 0.2491 | 0.2359 | 0.2239 | 0.213 | 0.2031 | 0.194 | 0.1856 | 0.1778 | 0.1707 | 0.164 | 0.1578
4 | 0.3274 | 0.3097 | 0.2938 | 0.2792 | 0.2659 | 0.2539 | 0.2428 | 0.2325 | 0.2231 | 0.2142 | 0.2061 | 0.1985 | 0.1914
5 | 0.3677 | 0.3491 | 0.3324 | 0.317 | 0.3028 | 0.2898 | 0.2778 | 0.2666 | 0.2564 | 0.2468 | 0.2379 | 0.2295 | 0.2217
6 | 0.403 | 0.3839 | 0.3663 | 0.3501 | 0.3355 | 0.3218 | 0.3092 | 0.29794 | 0.2864 | 0.2762 | 0.2666 | 0.2577 | 0.2493
7 | 0.434 | 0.4145 | 0.3967 | 0.3801 | 0.3648 | 0.3505 | 0.3374 | 0.3252 | 0.3138 | 0.3031 | 0.293 | 0.2836 | 0.2747
8 | 0.4615 | 0.442 | 0.4238 | 0.407 | 0.3914 | 0.3769 | 0.3633 | 0.3505 | 0.3387 | 0.3277 | 0.3173 | 0.3074 | 0.2981
9 | 0.4862 | 0.4666 | 0.4484 | 0.4314 | 0.4156 | 0.4009 | 0.387 | 0.3741 | 0.3619 | 0.3503 | 0.3396 | 0.3295 | 0.3199
10 | 0.5083 | 0.4889 | 0.4707 | 0.4537 | 0.4378 | 0.4228 | 0.4088 | 0.3957 | 0.3833 | 0.3716 | 0.3605 | 0.35 | 0.3402
11 | 0.5288 | 0.5091 | 0.4911 | 0.4741 | 0.4581 | 0.4431 | 0.429 | 0.4157 | 0.4031 | 0.3913 | 0.3801 | 0.3694 | 0.3593
12 | 0.5477 | 0.528 | 0.5096 | 0.4929 | 0.4769 | 0.4619 | 0.4477 | 0.4343 | 0.4216 | 0.4096 | 0.3983 | 0.3875 | 0.3772
13 | 0.565 | 0.5456 | 0.5273 | 0.51 | 0.4943 | 0.4793 | 0.4651 | 0.4517 | 0.4389 | 0.4268 | 0.4153 | 0.4044 | 0.3941
14 | 0.5808 | 0.5617 | 0.5436 | 0.5265 | 0.5103 | 0.4955 | 0.4814 | 0.4679 | 0.4551 | 0.443 | 0.4314 | 0.4204 | 0.4099
15 0.4831 
16 
17 
18 
19 
20 
21 
22 
23 
24 
25 
26 
Table 22. Pre-Calculated Gittins Index when a=0.99 Part 2 (From Gittins, 1989) 
1 
2 
3 
4 
5 
6 
37 
8 
9 
10 
uu 
2 
13 
Table 23. Pre-Calculated Gittins Index when a=0.99 Part 3 (From Gittins, 1989) 







| α/(α+β) | A | B | C |
| 0.6 | 24.962 | 1.5745 | -150.2 |
| 0.8 | 31.032 | 1.5892 | -151.9 |
| 0.9 | 38.459 | 1.62 | 126.6 |


Table 24. Pre-Calculated A, B, and C when a=0.99 (From Gittins, 1989) 
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APPENDIX C. TOTAL NUMBER OF POSSIBLE PLANS 


The interdictor’s goal is to minimize the evader’s expected number of successful 
trips. He does this by implementing an interdiction plan. The total number of possible 
plans for the interdictor to choose from is dependent on the number of assets and the 
number of routes. The number of possible plans could be small if there are not many 
assets or routes. For instance, with one asset and n routes there are only n possible plans. With any number of assets but only one route, there is one possible plan. The number of possible plans, though, explodes as the number of routes and assets increases.


A. USING A TREE DIAGRAM 

To compute the total number of plans, first consider each asset type separately. For instance, we will look at computing the number of plans for only one asset type, j. The two parameters that affect the number of plans for this asset type are the number of assets ($r_j$) and the number of routes ($n$). When deciding how many assets to place on the first route, there are $r_j + 1$ possible choices. The extra choice arises because the interdictor could place zero assets on the first route. The options for the second route are $r_j + 1$ minus the number of assets placed on the first route, and so forth. When looking at the number of routes, there are $n - 1$ degrees of freedom, since the $n$th route gets whatever assets are left. This setup lends itself to a tree diagram that is $n - 1$ levels deep. The number of plans is the number of leaves. Figures 16-18 are tree diagrams for two assets ($r_j = 2$) with two, three, and four routes, respectively.
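The tree construction can be sketched as a short recursive enumeration. This is a minimal illustration, assuming assets of one type are interchangeable; the function name is hypothetical. Each level of the recursion chooses how many assets go on the next route, and the last route receives whatever is left, so each yielded tuple corresponds to one leaf of the tree.

```python
def enumerate_plans(r_j, n):
    """Yield every way to place r_j identical assets on n routes.

    Each plan is a tuple whose i-th entry is the number of assets on route i.
    """
    if n == 1:
        yield (r_j,)  # the last route gets whatever is left
        return
    for placed in range(r_j + 1):  # zero assets is a valid choice
        for rest in enumerate_plans(r_j - placed, n - 1):
            yield (placed,) + rest


# Two assets with two, three, and four routes.
for n in (2, 3, 4):
    plans = list(enumerate_plans(2, n))
    print(n, len(plans))  # 2 3, then 3 6, then 4 10
```

Enumerating the leaves explicitly is practical only for small instances, but it gives a direct check on the counting methods described in this appendix.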
[Tree diagram: 2 assets, 2 routes. Each node gives the number of assets of type j on route i; route 2 gets what is left.]

Figure 16. There are three possible plans with two assets and two routes


[Tree diagram: 2 assets, 3 routes. Each node gives the number of assets of type j on route i; route 3 gets what is left.]

Figure 17. There are six possible plans with two assets and three routes
[Tree diagram: 2 assets, 4 routes. Each node gives the number of assets of type j on route i; route 4 gets what is left.]

Figure 18. There are ten possible plans with two assets and four routes


B. USING AN ITERATIVE TABLE 


Another way to calculate the number of plans is by setting up a table. Table 25 is an example of this for two assets. The rows of the table are the number of routes, starting at two. The column headers are the number of possible choices ($r_j + 1$ down to one). The first row is populated with zeros except for the first column ($r_j + 1$), which has a one. The rest of the first column is also populated with ones. The rest of the table is filled in from top to bottom and left to right. Since the first row and first column are already filled in, the next block to populate is for three routes and column $r_j$. To fill in this block, add the number above the block to the number on its left. Repeat this process for the rest of the table until it is complete. To calculate the number of possible plans for a given number of routes, multiply each number in that row by its column header and add the products together. For instance, with four routes and two assets (as seen in Table 25), multiply the first column header by the number in the first column ($3 \cdot 1$), add the product of the second column header and the second column ($2 \cdot 2$), and finally add the product of the third column header and the third column ($1 \cdot 3$). The final answer is 10. Therefore, with four routes and one asset type with two assets, there are only ten possible plans for the interdictor.
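The table procedure can be sketched in a few lines. This is an illustrative implementation of the steps as described in the text, not code from the thesis; the function name is hypothetical, and it assumes at least two routes, since the table starts at two.

```python
def plans_by_table(r_j, max_routes):
    """Count plans for one asset type with the iterative table.

    Returns {number of routes: number of plans} for 2..max_routes.
    Column headers run from r_j + 1 down to 1.
    """
    if max_routes < 2:
        raise ValueError("the table starts at two routes")
    headers = list(range(r_j + 1, 0, -1))
    row = [1] + [0] * r_j  # first row: a one, then zeros
    totals = {}
    for n in range(2, max_routes + 1):
        if n > 2:
            new_row = [1]  # the first column is always one
            for c in range(1, len(row)):
                # each block is the number above plus the number to its left
                new_row.append(row[c] + new_row[c - 1])
            row = new_row
        # multiply each entry by its column header and add the products
        totals[n] = sum(h * v for h, v in zip(headers, row))
    return totals


print(plans_by_table(2, 4))  # {2: 3, 3: 6, 4: 10}
```

For two assets, the row for four routes is 1, 2, 3 under the headers 3, 2, 1, which reproduces the sum 3 + 4 + 3 = 10 worked out in the text.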


Total Routes | Number of Possible Choices: 3 | 2 | 1 | Total Plans
2 | 1 | 0 | 0 | 3
3 | 1 | 1 | 1 | 6
4 | 1 | 2 | 3 | 10

Table 25. The number of plans for two assets


C. MULTIPLE ASSET TYPES


If there is more than one asset type, each asset type is considered separately and the resulting counts are multiplied together to get the total number of plans. For example, suppose an interdictor is trying to cover four routes ($n = 4$) with four asset types ($m = 4$), each with a possibly different number of assets available ($r_j$). With $r_1 = 2$, there are ten possible plans when considering only $j = 1$. With $r_2 = 4$, there are thirty-five possible plans when considering only $j = 2$. With $r_3$ and $r_4$ both equal to one, there are four possible plans for each of these asset types. Multiplying 10 by 35 by 4 by 4 gives a total of 5,600 possible defense plans.
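The example can be checked with a short sketch. It uses the closed-form count C(r_j + n - 1, n - 1) for placing r_j identical assets on n routes. This is a standard combinatorial identity (often called stars and bars) that the thesis does not state, so it is an assumption here, though it agrees with every count given in this appendix. The function names are hypothetical.

```python
import math


def plans_for_type(r_j, n):
    """Number of ways to place r_j identical assets on n routes.

    Uses the stars-and-bars identity, which is an assumption added here
    and is not taken from the thesis.
    """
    return math.comb(r_j + n - 1, n - 1)


def total_plans(assets_per_type, n):
    """Multiply the per-type counts, since asset types are placed separately."""
    return math.prod(plans_for_type(r_j, n) for r_j in assets_per_type)


# Four routes; asset types with 2, 4, 1, and 1 assets.
print([plans_for_type(r, 4) for r in (2, 4, 1, 1)])  # [10, 35, 4, 4]
print(total_plans([2, 4, 1, 1], 4))                  # 5600
```

The product grows quickly because each additional asset type multiplies the count, which is why exhaustive search over plans becomes impractical.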
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APPENDIX D. CASE STUDY DATA 


Appendix D contains the results of the sixty-four sub cases. Each figure is a box plot⁴ of one case and the corresponding eight plans. The box plots show the maximum value, the 95th percentile upper bound, the mean, the 95th percentile lower bound, and the minimum value of the discounted number of successes for the sub cases.


A. DISCOUNTED NUMBER OF SUCCESSES 


Figure 19 gives the discounted number of successes for Case 1. The 95% 
confidence interval is small for each plan against Case 1. The two best plans for the 
Coalition Force are Plan 1 and Plan 5 with a discounted number of successes of about 50. 
We note that the CE Asset Allocation Algorithm suggests Plan 1 for Case 1. The two 
worst plans for the Coalition Force are Plan 6 and Plan 8 with a discounted number of 


successes of about 72. 
Figure 19. Discounted Number of Successes for Case 1


4 Box plots were created in Excel with the help of bloggpro (2007). 
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Figure 20 gives the discounted number of successes for Case 2. The 95% 
confidence interval is small for each plan against Case 2. The two best plans are Plan 2 
and Plan 6 with a discounted number of successes of about 73. We note that the CE 


Asset Allocation Algorithm suggests Plan 2 for Case 2. 
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Figure 20. Discounted Number of Successes for Case 2 
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Figure 21 gives the discounted number of successes for Case 3. The two best 
plans are Plan 3 and Plan 7 with a discounted number of successes of about 0.14. We 
note that the CE Asset Allocation Algorithm suggests Plan 3 for Case 3. The two worst 


plans are Plan 2 and Plan 6 with a discounted number of successes of about 0.70. 
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Figure 21. Discounted Number of Successes for Case 3 
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Figure 22 gives the discounted number of successes for Case 4. There is a huge 
variation in the minimum and maximum values, but on average all of the plans perform 


relatively the same with a discounted number of successes of about 0.80. 
Figure 22. Discounted Number of Successes for Case 4
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Figure 23 gives the discounted number of successes for Case 5. The 95% 
confidence interval is small for each plan against Case 5. The two best plans are Plan 1 
and Plan 5 with a discounted number of successes of about 60. We note that the CE 
Asset Allocation Algorithm suggests Plan 5 for Case 5. The worst plan is Plan 7 with a 


discounted number of successes of about 60. 
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Figure 23. Discounted Number of Successes for Case 5 
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Figure 24 gives the discounted number of successes for Case 6. The 95% 


confidence interval is small for each plan against Case 6. The two best plans are Plan 2 


and Plan 6 with a discounted number of successes of about 72. We note that the CE 


Asset Allocation Algorithm suggests Plan 6 for Case 6. 
Figure 24. Discounted Number of Successes for Case 6
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Figure 25 gives the discounted number of successes for Case 7. The two best 
plans are Plan 3 and Plan 7 with a discounted number of successes of about 0.13. We 
note that the CE Asset Allocation Algorithm suggests Plan 7 for Case 7. The two worst 


plans are Plan 2 and Plan 6 with a discounted number of successes of about 0.69. 
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Figure 25. Discounted Number of Successes for Case 7 
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Figure 26 gives the discounted number of successes for Case 8. All of the plans 
perform relatively the same, except Plan 3 and Plan 7, which have a discounted number 


of successes of about 0.84. 
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Figure 26. Discounted Number of Successes for Case 8 


B. THE TIME THE DTO FOUND THE BEST ROUTE 


The results for the time at which the DTO finds the best route in the sixty-four sub cases are in Figures 27-34. The minimum time period is 1, which occurs if the DTO finds the best route with his first decision and continues to use the same route. The maximum time period is 298, since both the Gittins Choice Algorithm and the Decreasing Choice Algorithm run for 298 time periods. If the DTO does not find the best route, then there is no result to collect. Therefore, each sub case can have a different number of observations.
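One way to read this measure is sketched below. It assumes that "finding" the best route means the first time period from which the DTO uses the best route in every remaining period of the run. That interpretation is inferred from the description above and is not confirmed by the thesis, and the function name and route labels are hypothetical.

```python
def time_best_route_found(choices, best_route):
    """Return the first period from which only the best route is chosen.

    choices: route chosen in each time period, with period 1 first.
    Returns None when the run does not end on the best route, in which
    case there is no observation to collect.
    """
    if not choices or choices[-1] != best_route:
        return None
    t = len(choices)
    # walk backward while the preceding period also used the best route
    while t > 1 and choices[t - 2] == best_route:
        t -= 1
    return t


print(time_best_route_found([3, 3, 3, 3], best_route=3))  # 1
print(time_best_route_found([1, 2, 3, 3], best_route=3))  # 3
print(time_best_route_found([3, 3, 1, 2], best_route=3))  # None
```

Under this reading, a replication in which the DTO settles on a different route contributes no observation, which is why the sub cases can differ in their number of observations.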
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Figure 27 gives the time the DTO finds the best route for Case 1. The time period 
that the DTO finds the best route is divided into three groups. The DTO either finds the best route in about time period 200 (Plan 1 and Plan 4), finds it in about time period 100 (Plan 3, Plan 6, and Plan 7), or does not find the best route (Plan 2, Plan 5, and Plan 8).
Figure 27. The time the DTO found the best route given he found the best route for Case 1
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Figure 28 gives the time the DTO finds the best route for Case 2. The time period 
that the DTO finds the best route is divided into two groups. The DTO either finds the 
best route in about time period 7 (Plan 1, Plan 3, Plan 4, Plan 5, and Plan 7) or he finds 
the best route in time period 80 (Plan 2, Plan 6, and Plan 8). 
Figure 28. The time the DTO found the best route given he found the best route for Case 2
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Figure 29 gives the time the DTO finds the best route for Case 3. The DTO rarely 
finds the best route in Case 3. Only against Plan 1, Plan 6, and Plan 7 does the DTO find the best route.
Figure 29. The time the DTO found the best route given he found the best route for Case 3


Figure 30 gives the time the DTO finds the best route for Case 4. The DTO 


always finds the best route in Case 4 at approximately time period 10. 
Figure 30. The time the DTO found the best route given he found the best route for Case 4
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Figure 31 gives the time the DTO finds the best route for Case 5. The DTO finds 


the best route in all of the plans, but on average it takes him until time period 265.
Figure 31. The time the DTO found the best route given he found the best route for Case 5


Figure 32 gives the time the DTO finds the best route for Case 6. The DTO finds 


the best route in all of the plans, but on average it takes him until time period 258.
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Figure 33 gives the time the DTO finds the best route for Case 7. The DTO rarely 
finds the best route in Case 7. He only finds the best route against Plan 1, Plan 6 and 


Plan 7. 
Figure 33. The time the DTO found the best route given he found the best route for Case 7
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Figure 34 gives the time the DTO finds the best route for Case 8. The DTO finds 


the best route in all of the plans, but on average it takes him until time period 10.
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